








SEQUENTIAL

ANALYSIS


By 

ABRAHAM WALD 


Late Professor of Mathematical Statistics

Columbia University 


DOVER PUBLICATIONS, INC. 

NEW YORK 











Copyright © 1947 by Abraham Wald
All rights reserved under Pan American and
International Copyright Conventions.



Published in Canada by General Publishing 
Company, Ltd., 30 Lesmill Road, Don Mills, 
Toronto, Ontario.

Published in the United Kingdom by Constable 
and Company, Ltd., 10 Orange Street, London WC 2.


This Dover edition, first published in 1973, is an 
unabridged and unaltered republication of the 
work originally published by John Wiley & Sons,
Inc., in 1947. 


International Standard Book Number: 0-486-61579-0
Library of Congress Catalog Card Number: 73-85900 


Manufactured in the United States of America 

Dover Publications, Inc. 

180 Varick Street 
New York, N. Y. 10014 




PREFACE 


This book presents the theory of a recently developed method of
statistical inference, that of sequential analysis. An effort has been
made to keep the exposition on a level that will make most of the book,
with the exception of the Appendix, understandable to readers whose
mathematical background does not go beyond college algebra and a
first course in calculus. Some knowledge of probability and statistics
is desirable for the understanding of the book, although not essential,
for a brief review is given of the fundamental concepts, such as random
variables, probability distributions, and statistical hypotheses.

To facilitate the reading of the book for those who have no advanced
mathematical training, some concessions are made to generality and
occasionally even to rigor. Furthermore, mathematical derivations
of somewhat intricate nature are put into the Appendix, the reading
of which may be omitted without impairing the understanding of the
rest of the book.


This book contains an expanded exposition of the ideas and results
I published in two technical papers on this subject, one of which
appeared in 1944 and the other in 1945, as well as some further
developments. Such developments, for example, are: the discussion of
multi-valued decisions and estimation in Part III; improvements in the
limits for the average number of observations required by a sequential
test; and limits for the effect of grouping in the binomial case. Some
recent results of M. A. Girshick are included and, in the discussion of
certain applications in Part II, use is made of some simplifications
contained in a publication of the Statistical Research Group of Columbia
University dealing with these applications.

Nearly all tables in the book were computed by the Statistical
Research Group of Columbia University while I was a consultant to
the group. A few sections of my two aforementioned publications have
been reproduced, mainly in the Appendix, without substantial changes.

I wish to express my indebtedness to Milton Friedman and W. Allen
Wallis, who proposed the problem of sequential analysis to me in
March, 1943. It was their clear formulation of the problem that gave
me the incentive to start the investigations leading to the present
developments. I also wish to express my thanks to the Social Science
Research Council for their help, which facilitated the publication of
this book. I am indebted to Mr. Mortimer Spiegelman of the Metropolitan
Life Insurance Company for his careful reading of the manuscript
and for making several valuable suggestions. Thanks are due
also to Mrs. E. Bowker who prepared the manuscript with particular
care.

A. W.

Columbia University
March, 1947


CONTENTS

INTRODUCTION

PART I. GENERAL THEORY

Chapter 1. ELEMENTS OF THE CURRENT THEORY OF TESTING STATISTICAL HYPOTHESES

1.1 Random Variables and Probability Distributions
    1.1.1 Notion of a Random Variable
    1.1.2 Cumulative Distribution Function (c.d.f.) of a Random Variable
    1.1.3 Probability Density Function
    1.1.4 Discrete Random Variables
    1.1.5 Expected Value and Higher Moments of a Random Variable
1.2 Notion of a Statistical Hypothesis
    1.2.1 Unknown Parameters of a Distribution
    1.2.2 Simple and Composite Hypotheses
1.3 Outline of the Current Procedure for Testing Statistical Hypotheses
    1.3.1 The Sample
    1.3.2 The General Nature of a Test Procedure
    1.3.3 Principles for Choosing a Critical Region
    1.3.4 Number of Observations Necessary if $\alpha$ and $\beta$ Have Preassigned Values
    1.3.5 Testing a Hypothesis Viewed as a Decision between Two Courses of Action

Chapter 2. SEQUENTIAL TEST OF A STATISTICAL HYPOTHESIS: GENERAL DISCUSSION

2.1 Notion of a Sequential Test
2.2 Consequences of the Choice of Any Particular Sequential Test
    2.2.1 The Operating Characteristic Function
    2.2.2 The Average (Expected) Sample Number (ASN) Function of a Sequential Test
2.3 Principles for the Selection of a Sequential Test
    2.3.1 Degree of Preference for Acceptance or Rejection of the Null Hypothesis $H_0$ as a Function of the Parameter $\theta$
    2.3.2 Requirements Imposed on the OC Function
    2.3.3 The ASN Function as a Basis for the Selection of a Sequential Test
2.4 The Case when a Simple Hypothesis Is Tested against a Single Alternative
    2.4.1 Efficiency of a Sequential Test
    2.4.2 Efficiency of the Current Test Procedure, Viewed as a Particular Case of a Sequential Test

Chapter 3. THE SEQUENTIAL PROBABILITY RATIO TEST FOR TESTING A SIMPLE HYPOTHESIS $H_0$ AGAINST A SINGLE ALTERNATIVE $H_1$

3.1 Definition of the Sequential Probability Ratio Test
3.2 Fundamental Relations among the Quantities $\alpha$, $\beta$, A, and B
3.3 Determination of the Constants A and B in Practice
3.4 The OC Function of the Sequential Probability Ratio Test
3.5 The ASN Function of a Sequential Probability Ratio Test
3.6 Saving in the Number of Observations Effected by the Use of the Sequential Probability Ratio Test instead of the Current Test Procedure
3.7 Lower Limit of the Probability That the Sequential Test Will Terminate with a Number of Trials Less Than or Equal to a Given Number
3.8 Truncation of the Sequential Test Procedure
3.9 Increase in the Expected Number of Observations Caused by Replacing the Exact Values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ by $(1 - \beta)/\alpha$ and $\beta/(1 - \alpha)$, Respectively

Chapter 4. OUTLINE OF A THEORY OF SEQUENTIAL TESTS OF SIMPLE AND COMPOSITE HYPOTHESES AGAINST A SET OF ALTERNATIVES

4.1 Tests of Simple Hypotheses
    4.1.2 Test of a Simple Hypothesis against One-Sided Alternatives
    4.1.3 Test of a Simple Hypothesis with No Restrictions on the Alternative Values of the Unknown Parameters
    4.1.4 Application of the General Procedure to Testing the Mean of a Normal Distribution with Known Variance
4.2 Tests of Composite Hypotheses
    4.2.1 Discussion of an Important Special Case
    4.2.2 Outline of the Test Procedure in the General Case
    4.2.3 Application of the General Procedure to Testing the Mean of a Normal Distribution with Unknown Variance (Sequential t-Test)
    4.2.4 A Particular Class of Problems Treated by Girshick

PART II. APPLICATION OF THE GENERAL THEORY TO SPECIAL CASES

Chapter 5. TESTING THE MEAN OF A BINOMIAL DISTRIBUTION (ACCEPTANCE INSPECTION OF A LOT WHERE EACH UNIT IS CLASSIFIED INTO ONE OF TWO CATEGORIES)

5.1 Formulation of the Problem
5.2 Tolerated Risks of Making Wrong Decisions
5.3 The Sequential Probability Ratio Test Corresponding to the Quantities $p_0$, $p_1$, $\alpha$, and $\beta$
    5.3.1 Derivation of Algebraic Formulas for the Test Criterion
    5.3.2 Tabular Procedure for Carrying Out the Test
    5.3.3 Graphical Procedure for Carrying Out the Test
5.4 The Operating Characteristic (OC) Function L(p) of the Test
    5.4.1 Determination of L(p) for Some Special Values of p
    5.4.2 Determination of L(p) over the Whole Range of p
    5.4.3 Exact Formula for L(p) When the Reciprocal of the Slope of the Decision Lines Is an Integer
5.5 The Average Sample Number (ASN) Function of the Test
5.6 Observations Taken in Groups
    5.6.1 General Discussion
    5.6.2 Upper and Lower Limits for the Effect of Grouping on the OC and ASN Curves
5.7 Truncation of the Test Procedure

Chapter 6. TESTING THE DIFFERENCE BETWEEN THE MEANS OF TWO BINOMIAL DISTRIBUTIONS (DOUBLE DICHOTOMIES)

6.1 Formulation of the Problem
6.2 The Classical Method
6.3 An Exact Non-Sequential Method
6.4 Sequential Test of the Hypothesis That $p_1 \ge p_2$
    6.4.1 Risks That We Are Willing to Tolerate of Making Wrong Decisions
    6.4.2 The Sequential Probability Ratio Test Corresponding to the Quantities $u_0$, $u_1$, $\alpha$, and $\beta$
    6.4.3 The Operating Characteristic Curve of the Test
    6.4.4 The Average Amount of Inspection Required by the Test
    6.4.5 Observations Taken in Groups


Chapter 7. TESTING THAT THE MEAN OF A NORMAL DISTRIBUTION WITH KNOWN STANDARD DEVIATION FALLS SHORT OF A GIVEN VALUE

7.1 Formulation of the Problem
7.2 Tolerated Risks of Making Wrong Decisions
7.3 The Sequential Probability Ratio Test Corresponding to the Quantities $\theta_0$, $\theta_1$, $\alpha$, and $\beta$
7.4 The Operating Characteristic (OC) Curve of the Test
7.5 The Average Amount of Inspection Required by the Test

Chapter 8. TESTING THAT THE STANDARD DEVIATION OF A NORMAL DISTRIBUTION DOES NOT EXCEED A GIVEN VALUE

8.1 Formulation of the Problem
8.2 Tolerated Risks for Making a Wrong Decision
8.3 The Sequential Probability Ratio Test Corresponding to the Quantities $\sigma_0$, $\sigma_1$, $\alpha$, and $\beta$
8.4 The Operating Characteristic (OC) Function of the Test
8.5 The Average Amount of Inspection Required by the Test
8.6 Modification of the Test Procedure When the Population Mean Is Not Known

Chapter 9. TESTING THAT THE MEAN OF A NORMAL DISTRIBUTION WITH KNOWN VARIANCE IS EQUAL TO A SPECIFIED VALUE

9.1 Formulation of the Problem
9.2 A Sequential Sampling Plan Satisfying the Imposed Requirements

PART III. THE PROBLEM OF MULTI-VALUED DECISIONS AND ESTIMATION

Chapter 10. THE CHOICE OF A HYPOTHESIS FROM A SET OF MUTUALLY EXCLUSIVE HYPOTHESES (MULTI-VALUED DECISION)

10.1 Formulation of the Problem
10.2 The General Nature of a Sequential Sampling Plan for Selecting a Hypothesis from a Set of Mutually Exclusive Hypotheses
10.3 Consequences of the Choice of Any Particular Sequential Sampling Plan
10.4 Principles for the Selection of a Sequential Sampling Plan
    10.4.1 Dependence of Importance of Possible Wrong Decisions on the Parameter Point $\theta$
    10.4.2 The Risk Function Associated with a Given Sampling Plan
    10.4.3 The Risk Function and the ASN Function as a Basis for the Selection of a Sequential Sampling Plan
    10.4.4 The Use of Certain Simple Weight Functions
10.5 Discussion of a Special Class of Sequential Sampling Plans

Chapter 11. THE PROBLEM OF SEQUENTIAL ESTIMATION

11.1 Principles of the Current Theory of Estimation by Intervals or Sets
11.2 Formulation of the Problem of Sequential Estimation by Intervals or Sets
11.3 A Special Class of Sequential Estimation Procedures

APPENDIX

A.1 Proof That the Probability Is 1 That the Sequential Probability Ratio Test Will Eventually Terminate
A.2 Upper and Lower Limits for the OC Function of a Sequential Test
    A.2.1 A Lemma
    A.2.2 A Fundamental Identity
    A.2.3 Derivation of Upper and Lower Limits for the OC Function
    A.2.4 Calculation of $\delta_\theta$ and $\eta_\theta$ for Binomial Distributions
    A.2.5 Calculation of $\delta_\theta$ and $\eta_\theta$ for Normal Distributions
A.3 Upper and Lower Limits for the ASN Function of a Sequential Probability Ratio Test
    A.3.1 General Formulas for Upper and Lower Limits
    A.3.2 Calculation of the Quantities $\xi_\theta$ and $\xi'_\theta$ for Binomial and Normal Distributions
A.4 Formulas for the OC and ASN Functions When z Can Take Only a Finite Number of Integral Multiples of a Constant
A.5 The Characteristic Function and Higher Moments of n
    A.5.1 Derivation of Approximate Formulas Neglecting the Excess of the Cumulative Sum over the Boundaries
    A.5.2 Derivation of Exact Formulas When z Can Take Only a Finite Number of Integral Multiples of a Constant
A.6 Approximate Distribution of n When z Is Normally Distributed
    A.6.1 The Case When B = 0 and A Is Finite
    A.6.2 The Case When B > 0 and A = $\infty$
    A.6.3 The Case When B > 0 and A Is Finite
    A.6.4 Some Remarks
A.7 Efficiency of the Sequential Probability Ratio Test
A.8 Determination of an Optimum Weight Function $w(\theta)$ in Some Special Cases of Testing Simple Hypotheses with No Restrictions on the Possible Alternative Values of the Parameters
    A.8.1 A Class of Cases for Which an Optimum Weight Function $w(\theta)$ Can Be Determined by a Simple Procedure
    A.8.2 Application to Testing the Means of Independently and Normally Distributed Random Variables with Known Variances
A.9 Determination of Optimum Weight Functions $w_a(\theta)$ and $w_r(\theta)$ in Some Special Cases of Testing Composite Hypotheses
    A.9.1 A Class of Cases for Which Optimum Weight Functions $w_a(\theta)$ and $w_r(\theta)$ Can Be Determined by a Simple Procedure
    A.9.2 Application to Testing the Mean of a Normal Distribution with Unknown Variance (Sequential t-Test)

Index


INTRODUCTION 


Sequential analysis is a method of statistical inference whose
characteristic feature is that the number of observations required by the
procedure is not determined in advance of the experiment. The decision
to terminate the experiment depends, at each stage, on the results
of the observations previously made. A merit of the sequential method,
as applied to testing statistical hypotheses, is that test procedures can
be constructed which require, on the average, a substantially smaller
number of observations than equally reliable test procedures based on
a predetermined number of observations.

This book presents the theory of a particular method of sequential
analysis, the so-called sequential probability ratio test, which was
devised by the author in 1943 mainly for the purpose of testing statistical
hypotheses. Compared with any other test procedure (sequential or
non-sequential), this particular sequential test procedure is shown, in
Section A.7, to effect the greatest possible saving in the average number
of observations when used for testing a simple hypothesis against a
single alternative. The sequential probability ratio test frequently
results in a saving of about 50 per cent in the number of observations
over the most efficient test procedure based on a fixed number of
observations.

The first idea of a sequential test procedure, i.e., a test for which the
number of observations is not determined in advance but is dependent
on the outcome of the observations as they are made, goes back to
H. F. Dodge and H. G. Romig,¹ who constructed a double sampling
procedure. According to this scheme the decision whether or not a
second sample should be drawn depends on the outcome of the
observations in the first sample. Whereas this method allows for only two
samples, Walter Bartky devised a multiple sampling scheme for the
particular case of testing the mean of a binomial distribution.² His
scheme is closely related to the test procedure that results from the
application of the sequential probability ratio test to this particular
case. The reason that Dodge and Romig introduced their double

InspecUon... ... 





sampling method, and Bartky his multiple sampling scheme, was, of
course, the recognition of the fact that they require, on the average,
a smaller number of observations than “single” sampling.

The occasional practice of designing a large scale experiment in
successive stages may be regarded as a forerunner of sequential analysis.
The idea of such chain experiments was briefly discussed by Harold
Hotelling.³ A very interesting example of this type is the series of
sample censuses of area of jute in Bengal carried out under the
direction of P. C. Mahalanobis.⁴ Sample censuses, steadily increasing in
size, were taken primarily for the purpose of obtaining preliminary
information about the parameters to be estimated. This information
was then used for designing the final sampling of the whole immense
jute area in Bengal.

The problem of sequential analysis arose in the Statistical Research
Group of Columbia University⁵ in connection with some comments
made by Captain G. L. Schuyler of the Bureau of Ordnance, Navy
Department. Milton Friedman and W. Allen Wallis recognized the
great potentialities and the far-reaching consequences that sequential
analysis might have for the further development of theoretical
statistics. In particular, they conjectured that a sequential test
procedure might be constructed which would control the possible errors
committed by wrong decisions exactly to the same extent as the best
current procedure based on a predetermined number of observations,
and at the same time would require, on the average, a substantially
smaller number of observations than the fixed number of observations
needed for the current procedure.⁶ Friedman and Wallis also
exhibited a few examples of sequential modifications of current test
procedures resulting, in some cases, in an increase of efficiency. It was
at this stage that they proposed the problem of sequential analysis to
the author. This gave the incentive for the author’s investigations
which then led to the development of the sequential probability ratio
test.

³ Harold Hotelling, “Experimental Determination of the Maximum of a
Function,” The Annals of Mathematical Statistics, Vol. 12 (1941), pp. 20–45.

⁴ P. C. Mahalanobis, “A Sample Survey of the Acreage under Jute in Bengal,
with Discussion on Planning of Experiments,” Proceedings of the 2nd Indian
Statistical Conference, Calcutta, Statistical Publishing Society (1940).

⁵ During World War II the Statistical Research Group operated under a
contract with the Office of Scientific Research and Development and was directed
by the Applied Mathematics Panel of the National Defense Research Committee.

⁶ Bartky’s multiple sampling scheme for testing the mean of a binomial
distribution provides an example of such a sequential test. His results were not known to
Friedman and Wallis at that time, since they were published nearly a year later.




Because of the usefulness of the sequential probability ratio test in
development work on military and naval equipment, it was classified
Restricted within the meaning of the Espionage Act. The author was
requested to submit his findings in a restricted report⁷ dated
September, 1943.⁸ In this report the sequential probability ratio test is
devised and the basic theory is given. To facilitate the use of this
new technique by the Army and the Navy, the Statistical Research
Group issued a second report in July, 1944, which gives an elementary
non-mathematical exposition of the applications of the sequential
probability ratio test and contains a considerable number of tables, charts,
and computational simplifications to facilitate applications.⁹

Further advances in the theory of the sequential probability ratio
test were made in 1944. The operating characteristic (OC) curve of
the sequential probability ratio test for the case of a binomial
distribution was found by Milton Friedman and George W. Brown
(independently of each other), and slightly earlier by C. M. Stockman in
England.¹⁰ The author then obtained the general OC curve for any
sequential probability ratio test.¹¹ A few months later a general
theory of cumulative sums was developed¹² which gives not only the
OC curve of any sequential probability ratio test but also the
characteristic function of the number of observations required by the test
and various other results.


The material in the author’s report together with the new advances
made in 1944 was published by him in a paper, “Sequential Tests of
Statistical Hypotheses,” in The Annals of Mathematical Statistics, June,
1945. The Statistical Research Group issued a revised edition of its


⁷ Abraham Wald, “Sequential Analysis of Statistical Data: Theory,” a report
submitted by the Statistical Research Group, Columbia University, to the Applied
Mathematics Panel, National Defense Research Committee ...

⁸ The restricted classification was removed in May, 1945.

⁹ Harold Freeman, “Sequential Analysis of Statistical Data: Applications,” a ...

... Method and Quality ...

... the Operating Characteristics of any Sequential Probability Ratio Test ...
to the Statistical Research Group, Columbia University ...





original report. The revised edition includes a discussion of the oper- 
ating characteristic and average sample number curves for various 
applications of the sequential probability ratio test. 

Independently of the development in this country and about the
same time, G. A. Barnard recognized the merits of a sequential method
of testing.¹⁴ He treated the problem of double dichotomies, using a
sequential method of testing which, however, differs from the one that
results from the application of the sequential probability ratio test.

This book consists of three parts and an Appendix. Part I contains
a discussion of the general theory of the sequential probability ratio
test. Part II discusses applications of the general theory given in
Part I. These applications are given primarily to illustrate the
general theory and to bring out some points of theoretical interest which
are specific to these applications. Accordingly, computational
simplifications are not stressed much and hardly any tables are given.¹⁵
Part III outlines briefly a possible approach to the problem of
sequential multi-valued decisions and estimation. This field is largely
unexplored and further progress is still a matter of future developments.
To facilitate the use of the book by readers with no advanced
mathematical training, mathematical derivations of somewhat intricate
nature are included in the Appendix.

¹⁴ G. A. Barnard, “Economy in Sampling with Reference to Engineering
Experimentation,” (British) Ministry of Supply, Advisory Service on Statistical Method
and Quality Control, Technical Report, Series “R,” No. Q.C./R/7.

¹⁵ For a more complete and detailed discussion of these applications the reader
is referred to the revised edition of the publication of the Statistical Research
Group mentioned before.


PART I. GENERAL THEORY


Chapter 1. ELEMENTS OF THE CURRENT THEORY OF 

TESTING STATISTICAL HYPOTHESES 


1.1 Random Variables and Probability Distributions 

1.1.1 Notion of a Random Variable 

The outcome of an experiment or the reading of a measurement is
usually a variable quantity or, more briefly, a variable, since generally
it can take different values. For example, repeated measurements on
the length of a bar will yield, in general, different values. Frequently,
it will be possible to make probability statements concerning the
outcome of an experiment or the reading of a measurement. Consider,
for example, the experiment consisting of the throw of a die whose sides
are numbered from 1 to 6. Here the outcome of the experiment may be
any integral number from 1 to 6. Various probability statements
regarding the outcome of the experiment can be made. For example, the
probability that the outcome will be equal to 5 is equal to 1/6, or the
probability that the outcome will be less than 4 is equal to 1/2, and so forth.
Probability statements can also be made about the outcome of the
following experiment: Suppose that an individual is selected at random
from a group of 1000 individuals and that his height is then measured.
The probability that the height of the selected individual will be less
than 68 inches is equal to 1/1000 times the number of individuals in the
group whose heights are less than 68 inches.

A variable x is called a random variable if for any given value c a
definite probability can be ascribed to the event that x will take a value
less than c. A general class of experiments where the outcome is a
random variable in the sense of the above definition may be described
as follows. Consider a class of N objects (or individuals) and some
measurable characteristic of these objects, such as weight, diameter, or
hardness. Suppose that the value x of this characteristic varies from
object to object in the class. The experiment consists in selecting at
random one object from the class of N objects, and then measuring
the value x of the characteristic of the selected object. Random
selection is selection of an object in such a way that each object in the
class of N objects has an equal chance of being chosen. The outcome
x of such an experiment is a random variable, since a probability can
be ascribed to the event that x will take a value less than c, for any
given value c. This probability is, in fact, equal to $N_c/N$, where $N_c$
is the number of objects in the class for which the characteristic under
consideration has a value less than c. An interesting special case is
that in which the characteristic under consideration can take only two
values. Such a situation arises, for instance, in the case of a
manufactured product where each unit is classified in one of two categories:
defective or non-defective. We shall ascribe the value 0 to a
non-defective unit and the value 1 to a defective unit. Then the
characteristic under consideration, i.e., the characteristic of being defective
or non-defective, can take only the values 0 and 1. Consider a lot
consisting of N units and let $N_d$ be the number of defectives in the lot.
If the experiment consists in inspecting a single unit drawn at random
from the lot, the outcome x of the experiment is a random variable
which can take only the values 0 and 1. The probability that x = 0
is equal to $(N - N_d)/N$, and the probability that x = 1 is equal to
$N_d/N$.
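The counting rule above (a probability equal to $N_c/N$) can be tried on small cases. The following is only an illustrative sketch: the function name `prob_less_than` and the ten-unit lot are assumptions made up for the example and do not come from the text.

```python
from fractions import Fraction

def prob_less_than(values, c):
    """Probability that a randomly selected object has a value below c,
    computed as N_c / N for a finite class of objects."""
    n_c = sum(1 for v in values if v < c)   # N_c: objects with value < c
    return Fraction(n_c, len(values))       # N_c / N

# Throw of a die: each face is one equally likely "object".
die = [1, 2, 3, 4, 5, 6]
p_less_than_4 = prob_less_than(die, 4)           # faces 1, 2, 3
p_equal_5 = Fraction(die.count(5), len(die))     # one face out of six

# Hypothetical lot of N = 10 units: 1 = defective, 0 = non-defective.
lot = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
p_defective = Fraction(sum(lot), len(lot))       # N_d / N
p_non_defective = 1 - p_defective                # (N - N_d) / N

print(p_less_than_4, p_equal_5, p_defective, p_non_defective)
```

Exact fractions are used so that the results can be compared directly with the values 1/6 and 1/2 quoted for the die.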

1.1.2 Cumulative Distribution Function (c.d.f.) of a Random Variable

Let x be a random variable and denote by F(t) the probability that
x will take a value less than a given value t. Then F(t) is a function
of t which is called the cumulative distribution function of x. Since
any probability must lie between 0 and 1, we must have $0 \le F(t) \le 1$
for all values of t. If $t_1$ and $t_2$ are two values such that $t_1 < t_2$, then the
probability that $x < t_2$ is greater than or equal to the probability that
$x < t_1$, i.e., $F(t_2) \ge F(t_1)$. In other words, F(t) cannot decrease as
t increases. A typical form of a c.d.f. F(t) is shown in Fig. 1 where t
is measured along the horizontal axis and F(t) along the vertical axis.

[Fig. 1]
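The two properties just stated, $0 \le F(t) \le 1$ and the fact that F(t) cannot decrease, can be checked numerically for a c.d.f. built from a finite group of objects. This is a minimal sketch; the heights and the grid of t values are invented for illustration, and F(t) follows the convention used here, the probability that x is strictly less than t.

```python
from fractions import Fraction

def cdf(values, t):
    """F(t): proportion of objects whose value is strictly less than t."""
    return Fraction(sum(1 for v in values if v < t), len(values))

heights = [64, 66, 66, 68, 70, 71, 73]          # hypothetical group, inches
grid = [60, 64, 65, 66, 67, 68, 70, 72, 74]     # increasing values of t

F = [cdf(heights, t) for t in grid]

bounded = all(0 <= f <= 1 for f in F)                         # 0 <= F(t) <= 1
non_decreasing = all(F[i] <= F[i + 1] for i in range(len(F) - 1))

print(bounded, non_decreasing)
```

Because the inequality is strict, F(64) is 0 here even though one individual is exactly 64 inches tall.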





For any given values a and b (a < b) we can easily derive the value
of the probability that $a \le x < b$ from the c.d.f. F(t). In fact, the
event that $x < a$ and the event that $a \le x < b$ are mutually
exclusive. Hence, the probability that one of these events will occur is
equal to the sum of the two probabilities: the probability that $x < a$
and the probability that $a \le x < b$. Thus, we have

$$(\text{probability that either } x < a \text{ or } a \le x < b) = (\text{probability that } x < a) + (\text{probability that } a \le x < b) \tag{1:1}$$

Since the probability that either $x < a$ or $a \le x < b$ is the same as
the probability that $x < b$, we obtain, from (1:1),

$$F(b) = F(a) + (\text{probability that } a \le x < b) \tag{1:2}$$

Hence, the probability that $a \le x < b$ is equal to $F(b) - F(a)$.
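As a quick check of the relation $F(b) - F(a)$ for the probability that $a \le x < b$, the sketch below compares it with a direct count for the throw of a die. The choice a = 2, b = 5 is an arbitrary assumption made for the example.

```python
from fractions import Fraction

def cdf(values, t):
    """F(t): probability that x < t for an object drawn at random."""
    return Fraction(sum(1 for v in values if v < t), len(values))

die = [1, 2, 3, 4, 5, 6]
a, b = 2, 5

# Probability that a <= x < b obtained from the c.d.f.
from_cdf = cdf(die, b) - cdf(die, a)

# The same probability obtained by counting outcomes directly.
direct = Fraction(sum(1 for v in die if a <= v < b), len(die))

print(from_cdf, direct)
```

Both computations count the faces 2, 3 and 4, so they agree.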

A simple interpretation of the c.d.f. F(t) can be given if the random
variable x is the value of a measurement on an object selected at
random from a given group of N objects. As mentioned in Section 1.1.1,
in this case the probability that the observed value of x satisfies some
equality or inequality relationship, such as x = c, or x < c, or a < x
< b, is equal to the proportion of objects in the group of N objects
for which the value of x satisfies the equality or inequality in question.
Thus, F(t) is simply equal to the proportion of objects in the group
for which x < t. With this interpretation of probability, the validity
of (1:2) becomes self-evident. It merely says this: The proportion of
objects for which x < b is equal to the proportion of objects for which
x < a plus the proportion of objects for which $a \le x < b$. The group
of N objects is frequently called population or universe. So far we have
considered only populations which contain a finite number of objects.
Such populations are called finite populations.

The interpretation of the probability that a certain relation (equality
or inequality) holds as the proportion of objects in the population for
which the value of x satisfies that relation proves useful in many
instances and we shall employ it frequently. However, if we restrict
ourselves to finite populations, such an interpretation is not always
possible. In fact, the c.d.f.'s which arise from finite populations are
of a special nature. Suppose that N is the number of objects in the
population. Then the random variable x can take at most N different
values. Let $a_1, \ldots, a_M$ be the different values x can take, arranged in
ascending order of magnitude, i.e., $a_1 < a_2 < \cdots < a_M$. Clearly,
$M \le N$. If the value of x is the same for several objects, then M < N.





The c.d.f . of x will be a step function of the type shown in Fig. 2. The 
distribution function makes exactly M jumps and the magnitude of 
each jump is equal to 1/N or an integral multiple of 1/N. A c.d.f. 
represented by a continuous curve, as shown in Fig. 1, is certainly not 
of this type. Thus, if the c.d.f. is given by a continuous curve, the 
interpretation of probabilities as proportions of a finite population is 
not possible. However, any c.d.f. can be approximated arbitrarily 
closely by a c.d.f. arising from a finite population, if the number N of 
objects in the population is sufficiently large. Thus, any c.d.f. can be 



regarded as a limiting form of a c.d.f. arising from a finite population 
when the number of objects in the population is increased indefinitely. 
This means that if we admit infinite populations ^ (populations with 
infinitely many objects), the interpretation of any probability as a 
certain proportion of an underlying population is always possible. Of 
course, the notion of an infinite population is only an abstraction con- 
structed merely for the purpose of simplifying the theory. To give an 
example of an underlying infinite population, consider a measurement 
on the length of a bar, the outcome of which is regarded as a random 
variable x having a c.d.f. Fit). Then the underlying infinite popula- 
tion may be thought of as an infinite sequence of repeated measure- 
ments on the length of the bar, and the actually observed measurement 
is considered an element drawn from this population. Sometimes the 
underlying population is finite, but the number N of objects in the 

* By an infinite population we mean an ordered infinite sequence of objects,
$O_1, O_2, \ldots$, ad inf. A certain measurable characteristic of these objects is considered
and the value x of this characteristic is assumed to vary from object to object.
By the proportion of objects in the infinite population for which x satisfies a given
relation (equality or inequality) we mean the limiting value of the corresponding
proportion in the finite population $(O_1, \ldots, O_N)$ as N increases indefinitely.




population is so large that we may find it more convenient to treat 
the problem as if N were infinite, i.e., as if the population were infinite.
Suppose, for example, that we are interested in the height distribution
of all male individuals of age 20 and above living in the United States.
The number of such individuals is so large that considerable mathe- 
matical simplification may be achieved by treating the population of 
such individuals as if it were infinite. 


1.1.3 Probability Density Function 

Let F(t) be the c.d.f. of a random variable x. As we have seen in
Section 1.1.2, the probability that x will lie in the interval from
$t - \Delta/2$ to $t + \Delta/2$ ($\Delta > 0$) is given by
$F(t + \Delta/2) - F(t - \Delta/2)$. The limiting value f(t) of the ratio

$\dfrac{F(t + \Delta/2) - F(t - \Delta/2)}{\Delta}$

as $\Delta$ approaches 0, provided that such a limiting value exists,* is
called the probability density of the random variable x at the value
x = t. The probability density f(t) is a function of t and is called the
probability density function of the random variable x. It follows from
the definition of the probability density f(t) that for small positive
values $\Delta$ the product $f(t)\,\Delta$ is a good approximation
to the probability that x will lie in the interval $t \pm \Delta/2$. A probability

density function does not always exist. If the random variable x is 
discrete, i.e., if x can take only discrete values, the c.d.f. is a step func- 
tion and no probability density function exists. 
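
The limiting process in this definition can be checked numerically. The following is a minimal Python sketch, not part of the original exposition: it assumes the standard normal distribution as an example, and the helper name `density_ratio` is introduced here only for illustration. It evaluates the ratio of the c.d.f. increment to the interval length for shrinking interval lengths and prints it next to the density.

```python
from statistics import NormalDist

def density_ratio(cdf, t, delta):
    """Ratio [F(t + delta/2) - F(t - delta/2)] / delta for a c.d.f. F."""
    return (cdf(t + delta / 2) - cdf(t - delta / 2)) / delta

# Assumed example: the standard normal distribution (mean 0, variance 1).
standard_normal = NormalDist(mu=0.0, sigma=1.0)

t = 0.5
for delta in (1.0, 0.1, 0.01):
    approx = density_ratio(standard_normal.cdf, t, delta)
    exact = standard_normal.pdf(t)
    print(f"delta={delta:<5} ratio={approx:.6f} density={exact:.6f}")
```

For a discrete random variable the same ratio would grow without bound at a jump of the step function as the interval shrinks, which is one way to see why no density exists in that case.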

The probability that x will take a value within the interval from
$t_1$ to $t_2$ ($t_1 < t_2$) can be obtained by integrating the probability density
function f(t) from $t_1$ to $t_2$; i.e., the probability in question is given by

$\displaystyle\int_{t_1}^{t_2} f(t)\,dt$

* The existence of the limiting value of $[F(t + \Delta) - F(t)]/\Delta$ is meant, where
$\Delta$ may be positive or negative and may approach 0 in any arbitrary manner. The
existence of this limiting value implies the existence of the limiting value of
$[F(t + \Delta/2) - F(t - \Delta/2)]/\Delta$.



One of the most important probability density functions is the so-called
normal probability density function, which is given by

(1:3)    $f(t) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(t-\mu)^2/(2\sigma^2)}$

where $\mu$ and $\sigma$ are some constant values. If a random variable x has
a probability density function f(t) given by (1:3), we say that x is
normally distributed, or x has a normal distribution. The shape of a
normal curve is shown in Fig. 3, where t is measured along the horizontal
axis and f(t) along the vertical axis.



Fig. 3 


1.1.4 Discrete Random Variables 

A random variable x is called discrete if it can take only discrete 
values. Any variable which can take only a finite number of different 
values is, of course, a discrete variable. A variable which can take 
infinitely many values may still be discrete. For example, if the variable
x is restricted to integral values, x is discrete. The c.d.f. of a discrete
random variable is a step function, as shown in Fig. 2. Thus, a
discrete random variable has no probability density function, but
admits an elementary probability law f(t), where f(t) denotes the
probability that x = t.

In what follows we shall consider only random variables which 
either admit a probability density function or have a discrete distri- 
bution. By the probability distribution, or more briefly distribution, 
f(t), of a random variable x, we shall always mean the probability
density function of x, if a probability density function exists. If x is
a discrete random variable, f(t) will denote the probability that x = t.
We shall sometimes refer to the distribution f(t) of x also as the population
distribution of x, or the distribution of x in the population.




1.1.5 Expected Value and Higher Moments of a Random Variable

Suppose that x is a random variable which has a discrete distribution.
Let f(t) denote the distribution of x, i.e., f(t) is the probability
that x = t. Then the expected value of x, in symbols E(x), is defined by

(1:4)    $E(x) = \sum_{t} t\, f(t)$

where the summation is to be taken over all possible values t of x.
Interpreting the probability f(t) as the proportion of objects in the
population for which x = t, we see from (1:4) that the expected value
E(x) of x is the same as the mean value of x in the population. If x
is a continuous variable which admits a probability density function
f(t), then the expected value of x is given by

(1:5)    $E(x) = \displaystyle\int_{-\infty}^{\infty} t\, f(t)\, dt$

The expected value of x is often called also the population mean or
mean of x.

A function $\phi(x)$ of a random variable x is itself a random variable.
For any positive integer r and for any constant c, the expected value
of $(x - c)^r$ is called the rth population moment of x referred to the
value c. Of special interest is the case in which c = E(x). The expected
value of $[x - E(x)]^r$ is called the rth moment of x referred to
the mean. The second moment referred to the mean, i.e., the expected
value of $[x - E(x)]^2$, is also called the variance of x. The square root
of the variance is called the standard deviation.
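
For a discrete distribution these definitions reduce to finite sums, which the following minimal Python sketch evaluates. The distribution `f` and the helper name `moment` are hypothetical, chosen only to illustrate the definitions; they do not come from the text.

```python
from math import sqrt

def moment(dist, r, c=0.0):
    """Expected value of (x - c)**r for a discrete distribution.

    `dist` maps each possible value t of x to its probability f(t).
    """
    return sum((t - c) ** r * prob for t, prob in dist.items())

# Hypothetical distribution: x takes the values 0, 1, 2.
f = {0: 0.5, 1: 0.3, 2: 0.2}

mean = moment(f, 1)                 # E(x), the first moment referred to 0
variance = moment(f, 2, c=mean)     # second moment referred to the mean
std_dev = sqrt(variance)
print(mean, variance, std_dev)
```

Passing `c=mean` gives moments referred to the mean; leaving `c` at its default gives moments referred to the value 0.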

Consider the normal probability density function

(1:6)    $f(t) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(t-\mu)^2/(2\sigma^2)}$

where $\mu$ and $\sigma$ are constants ($\sigma > 0$). Let x be a random variable
whose distribution is given by (1:6). That the expected value of x
is then equal to $\mu$ and the variance of x is equal to $\sigma^2$ can easily be
verified.

1.2 Notion of a Statistical Hypothesis 

1.2.1 Unknown Parameters of a Distribution 

Let x be a random variable. A statistical problem arises when the
distribution of x is not known and we want to draw some inference
concerning the unknown distribution of x on the basis of a limited





number of observations on x. Frequently, the distribution of x is not 
entirely unknown, i.e., some partial knowledge of the distribution of
x is available a priori. To illustrate this we shall consider the two
following examples. 


Example 1. Consider a lot consisting of N units of a certain manufactured
product. Suppose that each unit is classified in one of the two categories, defective
and non-defective. The value 0 is assigned to each non-defective unit and the
value 1 to each defective unit. One unit is drawn at random from the lot and is
inspected. The outcome x of this experiment is a random variable which can take
only the values 0 and 1. Denote by p the proportion of defectives in the lot.
Then the probability that x = 1 is equal to p and the probability that x = 0 is
equal to 1 - p. Thus, if the value of p were known, the distribution of x would be
completely known. Usually p is unknown and we want to make some inference
regarding the value of p by inspecting a limited number of units drawn from the lot.
If p is unknown, we have only partial knowledge of the distribution of x; we know
merely that x is restricted to the values 0 and 1. In this case p is considered an
unknown parameter which can have any value between 0 and 1. We shall also
say that the distribution of x involves an unknown parameter p. Thus in this
example the distribution of x is known except for the value of an unknown
parameter p.

Example 2. Suppose that the length of a bar is measured with an instrument
for which the error of measurement is known to be normally distributed. The
outcome x of such a measurement is then a normally distributed random variable,
i.e., the distribution of x is given by the normal density function

$f(t) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(t-\mu)^2/(2\sigma^2)}$

Usually the mean $\mu$ and the variance $\sigma^2$ of the distribution are unknown. These
quantities are also called the parameters of the normal distribution. The mean $\mu$
can take any real value and $\sigma^2$ can take any positive value. Thus, in this example
too, the distribution function is known except for the values of the parameters
$\mu$ and $\sigma^2$ involved in the distribution function.


A general situation similar to that given in Examples 1 and 2 may
be described as follows: The functional form of the distribution function
is known and merely the values of a finite number of parameters involved
in the distribution function are unknown; i.e., the distribution function
is known except for the values of a finite number of parameters. In Example 1
the only unknown parameter is the proportion p of defectives
in the lot. In Example 2 there are two unknown parameters, the mean
$\mu$ and the variance $\sigma^2$.

In what follows we shall assume that the distribution of the random
variable x is known except for the values of a finite number of parameters.





1.2.2 Simple and Composite Hypotheses 

Let $\theta_1, \ldots, \theta_k$ be the unknown parameters of the distribution of the
random variable x under consideration. A statement about the values
of $\theta_1, \ldots, \theta_k$ is called a simple hypothesis if it determines uniquely the
values of all k parameters. It is called a composite hypothesis if it is
consistent with more than one value for some parameter. For example,
if there are two unknown parameters, $\theta_1$ and $\theta_2$, involved in
the distribution of x, the hypothesis that $\theta_1 = 2$ and $\theta_2 = 4$ is a simple
hypothesis, since it specifies completely the values of the unknown
parameters. On the other hand, the hypothesis that $\theta_1 = \theta_2$ is composite.
In Example 1 the statement that the unknown proportion p
of defectives is equal to .2 is a simple hypothesis. On the other hand,
the statement that p lies between .1 and .3 is a composite hypothesis.
In Example 2 the statement that $\mu = 3$ would be a composite hypothesis,
since it does not specify the value of the unknown variance $\sigma^2$.

In general, the parameters $\theta_1, \ldots, \theta_k$ will not be subject to any
a priori restrictions; i.e., they may take any values. However, the
parameters may in some cases be restricted to certain intervals. For
instance, if one of the unknown parameters is the standard deviation, 
this parameter is restricted to positive values. In other cases the 
parameter may be able to take only a finite number of discrete values. 

1.3 Outline of the Current Procedure for Testing Statistical Hypotheses


1.3.1 The Sample 

Let x be a random variable and suppose that we wish to test a
hypothesis concerning the unknown parameters of the distribution of
x. The decision to accept or reject the hypothesis in question is always
made on the basis of a finite number of observations on x. A set of a
finite number of observations on x is called a sample. The number of
observations contained in the sample is called the size of the sample.
We shall be concerned mostly with the case in which the successive
observations on x are independent in the probability sense. The successive
observations $x_1, \ldots, x_n$ on x are said to be independent in the
probability sense if the (conditional) probability distribution of the ith
observation $x_i$ ($i = 2, \ldots, n$), when the values of the preceding observations
$x_1, \ldots, x_{i-1}$ are known, is not affected by these values. This
condition is not strictly fulfilled when the observations are
drawn from a finite population. Consider, for instance, the case discussed
in Example 1 of Section 1.2.1. Suppose that two successive units
are drawn at random from the lot. Denote by $x_1$ the value of x for




the first unit and by $x_2$ the value of x for the second unit. The distribution
of $x_1$ is clearly given as follows: the probability that $x_1 = 0$ is
$1 - p$ and the probability that $x_1 = 1$ is equal to p. The distribution
of $x_2$, when the value of $x_1$ is known, is given as follows (N denoting
the number of units in the lot): if $x_1 = 0$,
then the probability that $x_2 = 1$ is equal to $pN/(N - 1)$ and the probability
that $x_2 = 0$ is equal to $1 - [pN/(N - 1)]$. On the other hand,
if $x_1 = 1$, the probability that $x_2 = 1$ is equal to $(pN - 1)/(N - 1)$
and the probability that $x_2 = 0$ is equal to $1 - [(pN - 1)/(N - 1)]$.

Thus, the probability distribution of $x_2$ is affected by the outcome of
$x_1$. For similar reasons no strict independence can prevail in any other
case in which the successive observations are drawn from a finite popu- 
lation. However, if the number of objects in the finite population is 
sufficiently large, the dependence is only slight and can be neglected. 
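
How quickly the dependence fades can be seen by evaluating the two conditional probabilities for growing lot sizes. The sketch below is an illustration only: the proportion of defectives (0.2) and the lot sizes are assumed values, and `second_draw_defective` is a name introduced here.

```python
from fractions import Fraction

def second_draw_defective(N, defectives, first_outcome):
    """Probability that x2 = 1 given x1, drawing without replacement.

    N is the lot size, `defectives` the number of defective units (p * N),
    and `first_outcome` is 0 (non-defective) or 1 (defective).
    """
    remaining_defectives = defectives - first_outcome
    return Fraction(remaining_defectives, N - 1)

for N in (10, 1000, 100000):
    defectives = N // 5          # assumed proportion p = 0.2
    after_good = second_draw_defective(N, defectives, 0)   # pN / (N - 1)
    after_bad = second_draw_defective(N, defectives, 1)    # (pN - 1) / (N - 1)
    print(N, float(after_good), float(after_bad), float(after_good - after_bad))
```

The difference between the two conditional probabilities is exactly 1/(N - 1), so it shrinks in proportion to the lot size.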

Let x be a discrete random variable, and denote the distribution of
x by f(t), i.e., f(t) is the probability that x = t. Let $x_1, \ldots, x_n$ be a
set of n independent observations on x. Because of the independence
of the observations, the probability of obtaining a sample equal to the
observed one is given by the product

$f(x_1)\, f(x_2) \cdots f(x_n)$

This product is also called the joint probability distribution of the
observations $x_1, \ldots, x_n$.

If x is a continuous random variable admitting a probability density
function f(x), then the joint density function of n independent observations
$x_1, \ldots, x_n$ on x is given by the product

$f(x_1)\, f(x_2) \cdots f(x_n)$
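
As a small illustration of the product rule, the following Python sketch computes the joint probability of a sample of 0/1 inspection results, assuming a lot large enough that the observations may be treated as independent. The sample, the value p = 0.2, and the helper names are hypothetical.

```python
from math import prod

def joint_probability(f, sample):
    """Product f(x1) * f(x2) * ... * f(xn) for independent observations."""
    return prod(f(x) for x in sample)

def bernoulli(p):
    """Distribution of a 0/1 variable: f(1) = p, f(0) = 1 - p."""
    return lambda x: p if x == 1 else 1 - p

sample = [0, 1, 0, 0, 1]            # hypothetical observed sample
print(joint_probability(bernoulli(0.2), sample))
```

The same function serves for a continuous variable if `f` is a density, in which case the result is the joint density at the observed sample rather than a probability.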

1.3.2 The General Nature of a Test Procedure 

Denote by n the number of observations on the basis of which the 
acceptance or rejection of the hypothesis in question is to be decided. 
Any possible outcome of n successive observations is a sample of size n. 
A test procedure leading to the acceptance or rejection of the hypoth- 
esis in question is simply a rule specifying, for each possible sample of 
size n, whether the hypothesis should be rejected or accepted on the
basis of that sample. This may also be expressed as follows: A test
procedure is simply a subdivision of the totality of all possible samples 
of size n into two mutually exclusive parts, say part 1 and part 2, 
together with the application of the rule that the hypothesis be re- 
jected if the observed sample is contained in part 1 and that the 
hypothesis be accepted if the observed sample is contained in part 2. 
Part 1 is also called the critical region. Since part 2 is the totality of
all samples of size n which are not included in part 1, part 2 is uniquely
determined by part 1. Thus, choosing a test procedure is equivalent 
to determining a critical region. 

As an illustration, we shall discuss a few examples. Suppose that a 
lot consisting of N units of a manufactured product is submitted for 
acceptance inspection. Assume that each unit is classified in one of 
the two categories: defective and non-defective. The proportion p of
defectives in the lot is assumed to be unknown. Let $p_0$ be a value
between 0 and 1 such that we prefer to accept the lot if the proportion
p of defectives is $\le p_0$ and we prefer to reject the lot if $p > p_0$. Suppose
that a sample of n units, drawn at random from the lot, is inspected
and on the basis of this sample a decision is to be made to accept the
lot or reject it. In other words, on the basis of the inspection of the
sample of n units a decision is to be made to accept the hypothesis
$p \le p_0$ or reject it. The critical region generally used in this case is
defined as follows: The hypothesis that $p \le p_0$ is rejected, i.e., the lot
is rejected, if, and only if, the proportion of defectives in the observed
sample of n units exceeds a suitably chosen numerical constant c.

Another example: Suppose that the length of a bar is measured with 
an instrument for which the error of measurement is known to be 
normally distributed with variance equal to unity. Thus, the outcome 
x of a measurement is a normally distributed random variable with
mean $\mu$ equal to the true length of the bar and variance unity. Let
the hypothesis to be tested be the statement that the true length of
the bar is equal to a specified value $\mu_0$. This hypothesis is to be tested
on the basis of a sample consisting of n independent measurements
$x_1, \ldots, x_n$ on the length of the bar. The critical region generally used
for this purpose is defined as follows: The hypothesis that $\mu = \mu_0$ is
rejected if, and only if, the sample observed is such that $|\bar{x} - \mu_0| \ge c$,
where $\bar{x}$ denotes the arithmetic mean of the n observations and c is a
suitably chosen numerical constant.
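
Both critical regions are simply rules mapping a sample to a decision, and can be written as such. The Python sketch below is illustrative only: the function names, the data, and the constants c are assumptions and are not taken from the text.

```python
from statistics import fmean

def reject_lot(sample, c):
    """Acceptance inspection: reject if the sample proportion of defectives exceeds c.

    `sample` is a list of 0/1 values (1 = defective).
    """
    return sum(sample) / len(sample) > c

def reject_length(sample, mu0, c):
    """Bar length: reject the hypothesis mu = mu0 if |mean - mu0| >= c."""
    return abs(fmean(sample) - mu0) >= c

# Hypothetical data and constants, chosen only for illustration.
print(reject_lot([0, 0, 1, 0, 1, 0, 0, 0, 0, 0], c=0.1))         # proportion 0.2
print(reject_length([10.2, 9.9, 10.4, 10.1], mu0=10.0, c=0.5))   # mean 10.15
```

Each function returns True exactly when the observed sample falls in part 1 (the critical region) and False when it falls in part 2.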

There are, in general, infinitely many possibilities for choosing a
critical region. For instance, in the example just discussed we could
have used the median, or the geometric mean, or the harmonic mean,
or some other mean of the observations instead of the arithmetic mean.

The various critical regions cannot be regarded as equally good and
the fundamental problem in testing hypotheses is to set up principles
for the proper choice of the critical region. Such principles have been
advanced by Jerzy Neyman and Egon S. Pearson. In the next section
we shall discuss briefly the basic idea of the Neyman-Pearson theory.*

* See, for example, J. Neyman and E. S. Pearson, Statistical Research Memoirs,
University College, London, Vol. I (1936), pp. 1-37.





1.3.3 Principles for Choosing a Critical Region 

The principles formulated by Neyman and Pearson for the proper 
choice of a critical region constituted an advance of fundamental im- 
portance in the theory of testing hypotheses. The purpose of this 
section is to indicate briefly the basic idea of the Neyman-Pearson theory.

A simple case of particular theoretical interest arises when only one
unknown parameter $\theta$ is involved in the distribution of the random
variable x under consideration, and $\theta$ can take only two values, $\theta_0$ and
$\theta_1$. The basic idea of the Neyman-Pearson theory can be indicated
even in this simple case. Therefore, in the rest of this section, as
well as in the following section, 1.3.4, we shall restrict ourselves to
the case of a single parameter $\theta$ which can take only two values,
$\theta_0$ and $\theta_1$.

For any value $\theta$ of the parameter, let $f(x, \theta)$ denote the distribution
of x. We shall denote $f(x, \theta_0)$ by $f_0(x)$ and $f(x, \theta_1)$ by $f_1(x)$. Suppose
that it is desired to test the hypothesis that $\theta = \theta_0$. We shall refer to
this hypothesis as the null hypothesis and denote it by $H_0$. The hypothesis
that $\theta = \theta_1$ will be called the alternative hypothesis and will
be denoted by $H_1$. Thus, we shall deal with the problem of testing the
hypothesis $H_0$ against the alternative hypothesis $H_1$ on the basis of
a sample of n independent observations $x_1, \ldots, x_n$ on x.

As a basis for choosing among critical regions the following considerations
have been advanced by Neyman and Pearson: In accepting or
rejecting $H_0$, we may commit errors of two kinds. We commit an
error of the first kind if we reject $H_0$ when it is true; we commit an
error of the second kind if we accept $H_0$ when $H_1$ is true. After a
particular critical region W has been chosen, the probability of committing
an error of the first kind, as well as the probability of committing
an error of the second kind, is uniquely determined. The probability
of committing an error of the first kind is equal to the probability,
determined on the assumption that $H_0$ is true, that the observed
sample will be included in the critical region W. The probability of
committing an error of the second kind is equal to the probability,
determined on the assumption that $H_1$ is true, that the observed sample
will fall outside the critical region W. For any given critical region
W we shall denote the probability of an error of the first kind by $\alpha$
and the probability of an error of the second kind by $\beta$.

The probabilities $\alpha$ and $\beta$ have the following important practical
interpretation: Suppose we draw a large number of samples of size n.
Let M be the number of such samples drawn. Suppose that for each
of these M samples we reject $H_0$ if the sample is included in W and
accept $H_0$ if the sample lies outside W. In this way we make M statements
of rejection or acceptance. Some of these statements will in
general be wrong. If $H_0$ is true and if M is large, the probability is
nearly 1 (i.e., it is practically certain) that the proportion of wrong
statements (i.e., the number of wrong statements divided by M) will
be approximately $\alpha$. If $H_1$ is true, the probability is nearly 1 that the
proportion of wrong statements will be approximately $\beta$. Thus, we
can say that in the long run the proportion of wrong statements will
be $\alpha$ if $H_0$ is true and $\beta$ if $H_1$ is true.
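
This long-run interpretation can be illustrated by simulation. The sketch below is not from the text: it assumes normal observations with variance 1, the hypotheses "mean 0" and "mean 1", samples of size 4, and a critical region of the form "sample mean at least c". The number M of samples and the random seed are arbitrary choices.

```python
import random
from statistics import NormalDist, fmean

def wrong_statement_rate(true_mean, h0_is_true, n, c, M, rng):
    """Proportion of wrong statements among M samples of size n.

    The critical region W (an assumption for this sketch) is
    {sample mean >= c}; observations are normal with variance 1.
    """
    wrong = 0
    for _ in range(M):
        sample_mean = fmean(rng.gauss(true_mean, 1.0) for _ in range(n))
        rejected = sample_mean >= c
        # Rejecting a true H0, or accepting H0 when H1 is true, is an error.
        if rejected == h0_is_true:
            wrong += 1
    return wrong / M

# Assumed setting: H0: mean 0, H1: mean 1, n = 4 observations.
n, theta0, theta1 = 4, 0.0, 1.0
alpha = 0.05
c = theta0 + NormalDist().inv_cdf(1 - alpha) / n ** 0.5   # region of size alpha
beta = NormalDist().cdf((c - theta1) * n ** 0.5)          # error of the second kind

rng = random.Random(0)
rate_h0 = wrong_statement_rate(theta0, True, n, c, 20000, rng)
rate_h1 = wrong_statement_rate(theta1, False, n, c, 20000, rng)
print(f"alpha={alpha:.3f} observed={rate_h0:.3f}")
print(f"beta={beta:.3f} observed={rate_h1:.3f}")
```

With M this large the observed proportions should lie close to the computed probabilities, in line with the statement that the probability of approximate agreement is nearly 1.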

It is clear that one critical region W is more desirable than another
if it has smaller values of $\alpha$ and $\beta$. Although either $\alpha$ or $\beta$ can be made
arbitrarily small by a proper choice of the critical region W, it is impossible
to make both $\alpha$ and $\beta$ arbitrarily small for a fixed value of n,
i.e., a fixed sample size. To illustrate this point, consider the following
two extreme cases: (1) W is empty, i.e., we always accept $H_0$, irrespective
of the outcome of the sample. In this case $\alpha = 0$ and $\beta = 1$.
(2) W is the totality of all possible samples, i.e., we always reject $H_0$.
In this case $\alpha = 1$ and $\beta = 0$. If, for some reason, we decide to consider
only critical regions W for which $\alpha$ has a given fixed value, the
choice of W is based on the following principle, introduced by Neyman
and Pearson: Restricting ourselves to regions W for which $\alpha$ has a fixed
value, we choose that one for which $\beta$ is a minimum.

The quantity $\alpha$ is called the size of the critical region, and the
quantity $1 - \beta$ the power of the critical region. A critical region
which has the highest power in the class of all regions of equal size
is called a most powerful region. Since minimizing $\beta$ is the same as
maximizing $1 - \beta$, the Neyman-Pearson principle concerning the
choice of the critical region W can be formulated as follows: Restricting
ourselves to regions of a fixed size $\alpha$, we choose that one which is
most powerful.


For a fixed sample size, the probability $\beta$ is a (single-valued) function
of $\alpha$, say $\beta(\alpha)$, if a most powerful critical region is used. Thus,
given the number of observations on which the test is based, one of
the quantities $\alpha$ and $\beta$ can be chosen arbitrarily. The Neyman-Pearson
theory leaves the question of this choice open. It is clear
that if $\alpha$ is small, $\beta(\alpha)$ is in general large, and if $\alpha$ is large, $\beta(\alpha)$ is in
general small. The choice of $\alpha$ will be greatly influenced by the
relative importance of the errors of the first and second kinds in each
particular application. Suppose, for example, that the loss caused by
an error of the first kind is one dollar and the loss caused by an error
of the second kind is [...]. Then a small $\alpha$ and a large $\beta$
will be preferable to a large $\alpha$ and a small $\beta$.





Neyman and Pearson show that a region consisting of all samples
$(x_1, \ldots, x_n)$ which satisfy the inequality

(1:7)    $\dfrac{f_1(x_1)\, f_1(x_2) \cdots f_1(x_n)}{f_0(x_1)\, f_0(x_2) \cdots f_0(x_n)} \ge k$

is a most powerful critical region for testing the hypothesis $H_0$ against
the alternative hypothesis $H_1$. The term k on the right-hand side of
(1:7) is a constant chosen so that the region will have the required
size $\alpha$. The reason why the critical region defined by (1:7) is most
powerful can be indicated as follows: For simplicity suppose that the
probability distributions under $H_0$ and $H_1$ are discrete. Thus,
$f_i(x_1)\, f_i(x_2) \cdots f_i(x_n)$ ($i = 0, 1$) denotes the probability of obtaining a
sample equal to the observed one. The critical region defined by (1:7)
can be built up by starting with a sample $E^1 = (x_1^1, x_2^1, \ldots, x_n^1)$
for which the ratio

$\dfrac{f_1(x_1) \cdots f_1(x_n)}{f_0(x_1) \cdots f_0(x_n)}$

takes its maximum value. Then a sample $E^2 = (x_1^2, \ldots, x_n^2)$ is included
for which this ratio takes its
maximum value in the set of samples which is left after $E^1$ has been removed
from the totality of all possible samples. In general, after r samples
$E^1, \ldots, E^r$ have been included in the critical region, a sample $E^{r+1}$
is added for which the ratio takes its maximum value in the
set of samples $(x_1, \ldots, x_n)$ which are left after $E^1, \ldots, E^r$ have been
removed from the totality of all samples. This construction is continued
until the size of the region reaches the desired value $\alpha$.* Since
at any stage of the construction the last sample included in the critical
region has the largest probability under $H_1$ per unit probability under
$H_0$ as compared with any other sample not yet included in the region,
it can be seen that the probability measure of the critical region under
$H_1$, i.e., the power of the critical region, is greater than or equal to the
power of any other region of equal size.
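
The step-by-step construction just described can be sketched for a small discrete case. In the Python sketch below, the two distributions, the sample size, and the size 0.1 are assumed values, and `most_powerful_region` is a name introduced here. One design choice differs from the description above and should be noted: the loop stops before the size would exceed the target value, rather than at the first sample that reaches or passes it.

```python
from itertools import product
from math import prod

def most_powerful_region(f0, f1, values, n, alpha):
    """Add samples in decreasing order of the ratio
    f1(x1)...f1(xn) / f0(x1)...f0(xn), stopping before the size
    (probability under H0) would exceed alpha.

    Assumes f0 is positive for every possible value.
    """
    samples = list(product(values, repeat=n))
    p0 = {s: prod(f0[x] for x in s) for s in samples}   # probability under H0
    p1 = {s: prod(f1[x] for x in s) for s in samples}   # probability under H1
    ranked = sorted(samples, key=lambda s: p1[s] / p0[s], reverse=True)

    region, size, power = [], 0.0, 0.0
    for s in ranked:
        if size + p0[s] > alpha + 1e-12:
            break
        region.append(s)
        size += p0[s]
        power += p1[s]
    return region, size, power

f0 = {0: 0.8, 1: 0.2}    # H0: proportion of defectives 0.2 (assumed)
f1 = {0: 0.5, 1: 0.5}    # H1: proportion of defectives 0.5 (assumed)
region, size, power = most_powerful_region(f0, f1, values=(0, 1), n=5, alpha=0.1)
print(len(region), size, power)
```

Because samples with the same number of defectives have the same ratio, the order within such a group is arbitrary; only the size and power of the resulting region are determined.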

* If x is a discrete variable, it may happen that, at the last stage of the construction,
at the inclusion of the last sample in the critical region, the size of the region
increases from a value below $\alpha$ to a value somewhat greater than $\alpha$.

Let us illustrate the principle for choosing a critical region by application
to a simple and familiar case. Let $H_0$ be the hypothesis that
x is normally distributed with mean $\theta_0$ and variance unity. Let $H_1$ be
the hypothesis that x is normally distributed with mean $\theta_1$ and variance
unity. Assume $\theta_1 > \theta_0$. For testing $H_0$ against $H_1$ we shall have
to determine the ratio $\dfrac{f_1(x_1) \cdots f_1(x_n)}{f_0(x_1) \cdots f_0(x_n)}$. Since

$f_1(x_1) \cdots f_1(x_n) = \dfrac{1}{(2\pi)^{n/2}}\, e^{-\frac{1}{2}\sum_{\alpha=1}^{n}(x_\alpha - \theta_1)^2}$

and

$f_0(x_1) \cdots f_0(x_n) = \dfrac{1}{(2\pi)^{n/2}}\, e^{-\frac{1}{2}\sum_{\alpha=1}^{n}(x_\alpha - \theta_0)^2}$

the inequality (1:7) can be written as

(1:8)    $\dfrac{e^{-\frac{1}{2}\sum_{\alpha=1}^{n}(x_\alpha - \theta_1)^2}}{e^{-\frac{1}{2}\sum_{\alpha=1}^{n}(x_\alpha - \theta_0)^2}} \ge k$


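Taking the logarithm on both sides of this inequality, we obtain

$\frac{1}{2}\sum_{\alpha=1}^{n}(x_\alpha - \theta_0)^2 - \frac{1}{2}\sum_{\alpha=1}^{n}(x_\alpha - \theta_1)^2 = (\theta_1 - \theta_0)\sum_{\alpha=1}^{n} x_\alpha + \frac{n}{2}(\theta_0^2 - \theta_1^2) \ge \log k$

Hence

(1:9)    $\sum_{\alpha=1}^{n} x_\alpha \ge \dfrac{\log k - \frac{n}{2}(\theta_0^2 - \theta_1^2)}{\theta_1 - \theta_0} = k'$ (say)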


Inequality (1:9) can be written as

(1:10)    $\sum_{\alpha=1}^{n} (x_\alpha - \theta_0) \ge k' - n\theta_0 = k''$ (say)


Suppose that we wish to determine $k''$ such that the critical region
defined by the inequality (1:10) has the size $\alpha = .05$. Since under the
hypothesis $H_0$ the variable $\left[\sum_{\alpha=1}^{n}(x_\alpha - \theta_0)\right]/n$ is normally distributed
with zero mean and variance 1/n, we see from a table of the
normal distribution that $k''/n = 1.64/\sqrt{n}$. Thus, the most powerful
region of size .05 consists of all samples for which the inequality

(1:11)    $\dfrac{\sum_{\alpha=1}^{n}(x_\alpha - \theta_0)}{n} \ge \dfrac{1.64}{\sqrt{n}}$

holds.

We may use the critical region given by (1:11) for testing the hypothesis that $\theta = \theta_0$




against alternative values $\theta > \theta_0$. A remarkable feature of the region
given by (1:11) is that it does not depend on the alternative value $\theta_1$.
In the derivation of (1:11) merely the inequality $\theta_1 > \theta_0$ was used.
Hence, the test defined by the region (1:11) is most powerful with
respect to all alternatives $\theta > \theta_0$, i.e., it is a uniformly most powerful
test when the alternatives are restricted to values greater than $\theta_0$.
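
The fact that the region does not depend on the alternative can be made concrete by computing its boundary once and then evaluating the power at several alternatives. The following Python sketch assumes normal observations with variance unity; the function names and the numerical values of n and the alternatives are illustrative assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def critical_value(alpha, n):
    """Constant c such that the region  mean(x) - theta0 >= c  has size alpha."""
    return STD_NORMAL.inv_cdf(1 - alpha) / n ** 0.5

def power(theta, theta0, n, alpha):
    """Probability of rejecting H0 when the true mean is theta (variance 1)."""
    c = critical_value(alpha, n)
    # Under mean theta, mean(x) - theta0 is normal with mean theta - theta0
    # and variance 1/n.
    return 1 - STD_NORMAL.cdf((c - (theta - theta0)) * n ** 0.5)

n, alpha, theta0 = 25, 0.05, 0.0
print(critical_value(alpha, n) * n ** 0.5)     # about 1.645; the text rounds to 1.64
for theta in (0.0, 0.2, 0.5):
    print(theta, round(power(theta, theta0, n, alpha), 4))
```

Note that `critical_value` takes no alternative value as an argument, which mirrors the remark that only the direction of the alternative enters the derivation.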

1.3.4 Number of Observations Necessary if $\alpha$ and $\beta$ Have Preassigned Values

In the preceding section we assumed that $\alpha$ and the sample size n
were given and we were looking for a critical region for which $\beta$ was
a minimum. In this section we shall assume that $\alpha$ and $\beta$ are given
and our problem is to determine the minimum value of n for which
the power of the most powerful region of size $\alpha$ is greater than or equal
to $1 - \beta$.

Let $\beta_n$ denote the probability of an error of the second kind associated
with a most powerful critical region of size $\alpha$ when the test is
based on n observations. It can be shown that $\beta_n$ decreases, or at least
does not increase, with increasing n. In general, $\beta_n$ will approach 0
as n increases indefinitely. Denote by $n(\alpha, \beta)$ the smallest value of n
for which $\beta_n \le \beta$. If we want a test procedure such that the probability
of an error of the first kind is equal to $\alpha$ and the probability
of an error of the second kind does not exceed $\beta$, then according to the
current theory we must draw a sample of size $n \ge n(\alpha, \beta)$. If we use
a most powerful critical region, we need a sample of size $n = n(\alpha, \beta)$.
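
For the normal example with variance unity, the smallest adequate sample size can be found by direct search. The sketch below is an illustration under that assumption; the function names are introduced here, and the search presupposes that the alternative mean exceeds the hypothesized mean (otherwise the loop would not terminate).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def beta_n(n, alpha, theta0, theta1):
    """Second-kind error of the most powerful size-alpha region, n observations."""
    z = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(z - (theta1 - theta0) * n ** 0.5)

def required_sample_size(alpha, beta, theta0, theta1):
    """Smallest n with beta_n <= beta, found by direct search.

    Assumes theta1 > theta0, so that beta_n tends to 0 as n grows.
    """
    n = 1
    while beta_n(n, alpha, theta0, theta1) > beta:
        n += 1
    return n

print(required_sample_size(0.05, 0.05, theta0=0.0, theta1=1.0))
```

A search is used instead of a closed formula so that the sketch follows the definition of the smallest n literally.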

1.3.5 Testing a Hypothesis Viewed as a Decision between Two
Courses of Action

It happens frequently in practice that we have to decide between
two courses of action, say action 1 and action 2, and the preference
for one or the other action depends on the value of an unknown parameter
$\theta$ of the distribution of a random variable x. Denote by $\omega$ the
set of all values of $\theta$ for which action 2 is not preferable to action 1.
Thus, for any value $\theta$ not contained in $\omega$ we prefer action 2 to action 1.
The problem of deciding between these two actions on the basis of a
sample of n independent observations on x may be formulated as a
problem of testing the hypothesis H that the true value of $\theta$ is contained
in the set $\omega$. If the test procedure leads to the acceptance of
H we take action 1, and if it leads to the rejection of H we take action 2.

Consider, for example, the following problem. A lot consisting of a
large number of units of a manufactured product is submitted for
acceptance inspection. Suppose that the proportion p of defectives
in the lot is unknown. There are two courses of action: acceptance of
the lot and rejection of the lot. In general, there will exist a particular
value $p'$ of p such that if the true proportion of defectives is $< p'$ we
prefer acceptance and if $p > p'$ we prefer rejection. If $p = p'$ we are
indifferent which action is taken. Suppose that a decision is to be
made on the basis of a sample of n units drawn at random from the
lot. This problem may be viewed as a problem of testing the hypothesis
H that $p \le p'$ on the basis of a sample drawn from the lot. The
lot is accepted or rejected according as H is accepted or rejected.

As mentioned in Section 1.3.3, the choice of a, i.e., the size of the 
critical region, is greatly influenced by the relative importance we 
attach to errors of the first and second kinds. If the problem of test- 
ing a hypothesis arises out of the problem of deciding between certain 
two courses of action, the relative importance of the errors of the first 
and second kinds may be judged by considering the practical conse- 
quences of taking one action when the value of the parameter is such 
that the other action would have been preferable. 



Chapter 2. SEQUENTIAL TEST OF A STATISTICAL 
HYPOTHESIS: GENERAL DISCUSSION 


2.1 Notion of a Sequential Test 

In the current theory of testing hypotheses the number of observa- 
tions, i.e., the size of the sample on which the test is based, is treated 
as a constant for any particular problem. An essential feature of the 
sequential test, as distinguished from the current test procedure, is 
that the number of observations required by the sequential test de- 
pends on the outcome of the observations and is, therefore, not pre- 
determined, but a random variable. 

The sequential method of testing a hypothesis H may be described
as follows. A rule is given for making one of the following three decisions
at any stage of the experiment (at the mth trial for each integral
value of m): (1) to accept the hypothesis H, (2) to reject the hypothesis
H, (3) to continue the experiment by making an additional observation.
Thus, such a test procedure is carried out sequentially. On the
basis of the first observation one of the aforementioned three decisions
is made. If the first or second decision is made, the process is terminated.
If the third decision is made, a second trial is performed.
Again, on the basis of the first two observations one of the three decisions
is made. If the third decision is made, a third trial is performed,
and so on. The process is continued until either the first or the second
decision is made. The number n of observations required by such a
test procedure is a random variable, since the value of n depends on
the outcome of the observations.
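
The procedure amounts to a loop that consults a three-way decision rule after every observation. The following Python sketch shows that control flow only. The function `run_sequential_test` and the rule `toy_rule` are hypothetical; in particular, the toy rule is not a test taken from the text and has no claimed optimality.

```python
def run_sequential_test(observations, decide):
    """Apply a three-decision rule after each observation.

    `decide(sample)` must return "accept", "reject", or "continue".
    Returns the final decision and the number n of observations used.
    """
    sample = []
    for x in observations:
        sample.append(x)
        decision = decide(sample)
        if decision != "continue":
            return decision, len(sample)
    raise RuntimeError("observations exhausted before a decision was reached")

def toy_rule(sample):
    """Hypothetical rule for 0/1 inspection data, not taken from the text:
    reject on the 2nd defective, accept after 6 units with at most 1 defective."""
    if sum(sample) >= 2:
        return "reject"
    if len(sample) >= 6:
        return "accept"
    return "continue"

print(run_sequential_test([0, 1, 0, 1, 0, 0], toy_rule))   # stops at the 4th unit
print(run_sequential_test([0, 0, 1, 0, 0, 0], toy_rule))   # stops at the 6th unit
```

The two calls stop after different numbers of observations, which is the sense in which n is a random variable determined by the outcome of the observations.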

For each positive integral value m, we shall denote by M_m the to-
tality of all possible samples (x_1, ..., x_m) of size m. We shall also
refer to M_m as the m-dimensional sample space. A rule for making
one of the three decisions at any stage of the experiment can be de-
scribed as follows. For each integral value m, the m-dimensional sample
space is split into three mutually exclusive parts, R_m^0, R_m^1, and R_m.

After the first observation x_1 has been drawn, the hypothesis H that
is being tested is accepted if x_1 lies in R_1^0, H is rejected if x_1 lies in
R_1^1, or a second observation is made if x_1 lies in R_1. If the third
decision is made and a second observation x_2 drawn, H is accepted,
H is rejected, or a third observation is drawn, according as the ob-
served sample (x_1, x_2) lies in R_2^0, R_2^1, or R_2. If (x_1, x_2) lies in R_2,
a third observation is drawn and one of the three decisions is made
according as (x_1, x_2, x_3) lies in R_3^0, R_3^1, or R_3, and so on. This process
is stopped when, and only when, either the first or the second decision
is made.¹ Thus, a sequential test is completely defined by defining
the sets R_m^0, R_m^1, and R_m for all positive integral values m. Since
R_m^0, R_m^1, and R_m are mutually exclusive and add up to the whole
sample space M_m, it is sufficient to define any two of the sets R_m^0,
R_m^1, and R_m. Any one of the three sets R_m^0, R_m^1, and R_m consists
precisely of all those samples which are not contained in the other two.

We shall call a sample (x_1, ..., x_m) ineffective if it contains an initial
segment (x_1, ..., x_m'), where m' < m, such that (x_1, ..., x_m') lies in
R_m'^0 or in R_m'^1. A sample which is not ineffective will be said to be
an effective sample. Clearly, for a sequential test procedure we shall
have an effective sample at any stage of the experiment. Thus, in
defining the sets R_m^0, R_m^1, and R_m we may disregard ineffective sam-
ples. In other words, it is sufficient to state in which of the sets R_m^0,
R_m^1, and R_m each effective sample (x_1, ..., x_m) should be included,
since ineffective samples cannot occur during the sequential process.

The following is a simple example of a sequential test. Suppose that
a lot consisting of a large number of units of a manufactured product
is submitted for acceptance inspection. Each unit is classified in one
of the two categories: defective and non-defective. The proportion p
of defectives in the lot is unknown. The lot is considered acceptable
if p ≤ a given value p'. If p > p' we prefer to reject the lot. Thus
we are interested in testing the hypothesis H that p ≤ p'. The follow-
ing procedure of testing H is a simple example of a sequential test.
Let n_0 denote a given integer. If the first n_0 units inspected are non-
defective, we stop inspection and the lot is accepted (H is accepted).
If, for some m ≤ n_0, the mth unit inspected is found defective, no
further units are inspected and the lot is rejected (H is rejected).
We shall assign the value 0 to any non-defective unit and the value 1
to any defective unit. Then a sample (x_1, ..., x_m) is effective if and
only if m ≤ n_0 and x_1 = ... = x_{m-1} = 0. The set R_m^0 is empty for
m < n_0, i.e., acceptance is not possible for m < n_0. The set R_{n_0}^0
contains only one effective sample: (0, 0, ..., 0). For any m ≤ n_0, the
set R_m^1 contains exactly one effective sample: (0, ..., 0, 1).

¹ ... the probability is one that the ...

A basic problem in the theory of sequential tests is that of a proper
choice of these sets. To formulate principles for a proper choice of
the sets R_m^0, R_m^1, and R_m, it is necessary to study the consequences
of any particular choice. This will be done in the next section.
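
The procedure described in this section can be sketched in code. The following is only an illustration under stated assumptions and is not part of the original text: the names run_sequential_test and lot_rule, and the string labels for the three decisions, are hypothetical. The rule passed in plays the part of the sets R_m^0, R_m^1, and R_m, and lot_rule encodes the lot-inspection example with 0 for a non-defective and 1 for a defective unit.

```python
from typing import Callable, Iterable, List, Tuple

ACCEPT, REJECT, CONTINUE = "accept", "reject", "continue"

def run_sequential_test(observations: Iterable[int],
                        rule: Callable[[List[int]], str]) -> Tuple[str, int]:
    """Apply `rule` after each observation until it accepts or rejects.

    Returns the terminal decision and the number n of observations used.
    """
    sample: List[int] = []
    for x in observations:
        sample.append(x)
        decision = rule(sample)   # tells which of the three sets holds the sample
        if decision != CONTINUE:
            return decision, len(sample)
    raise ValueError("observations exhausted before a terminal decision")

def lot_rule(n0: int) -> Callable[[List[int]], str]:
    """Rule of the lot example (assumed coding: 0 = non-defective, 1 = defective)."""
    def rule(sample: List[int]) -> str:
        if sample[-1] == 1:       # effective sample (0, ..., 0, 1): reject
            return REJECT
        if len(sample) == n0:     # effective sample (0, ..., 0) of size n0: accept
            return ACCEPT
        return CONTINUE
    return rule
```

For instance, with n_0 = 5 the call run_sequential_test([0, 0, 1, 0], lot_rule(5)) stops at the third unit with a rejection, and the fourth unit is never inspected.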

2.2 Consequences of the Choice of Any Particular Sequential Test 

2.2.1 The Operating Characteristic Function 

After a particular sequential test has been adopted, i.e., a particular
choice of the sets R_m^0, R_m^1, and R_m (m = 1, 2, ...) has been made,
the probability that the process will terminate with the acceptance of
the hypothesis H_0 under test depends only on the distribution of the
random variable x under consideration. As before, it is assumed that
the distribution of x is known except for the values of a finite number
of parameters, θ_1, ..., θ_k, say. Thus, the distribution of x is given
by a function f(x, θ_1, ..., θ_k) where the functional form f is known, but
the true values of the parameters θ_1, ..., θ_k are unknown. To simplify
notation, we shall use the letter θ without subscript to denote the set
of all k parameters θ_1, ..., θ_k. We shall refer to θ as a parameter point,
since θ can be represented geometrically by a point with the coordi-
nates θ_1, ..., θ_k. Since the distribution of x is determined by the
parameter point θ, the probability of accepting H_0 will be a function
of θ. This function will be denoted by L(θ) and will be called the
operating characteristic (OC) function. If there is only one unknown
parameter θ the function L(θ) can be plotted as a curve, θ being meas-
ured along the horizontal axis and L(θ) along the vertical axis. Since
we shall consider only tests for which the probability that the proce-
dure will eventually terminate is equal to 1, the probability of reject-
ing H_0 is equal to 1 - L(θ).

The OC function is very closely related to the notion of the power
function in the current theory of tests. For any parameter point θ
which is not consistent with the null hypothesis H_0, the power of the
test is defined as the probability of rejecting H_0 when θ is the true
point. Thus, for any θ not consistent with H_0 the power of the test
is equal to 1 - L(θ).

To illustrate the meaning of an OC function, we shall compute the
OC function of the particular sequential test given as an example in
the preceding section. In that example the only unknown parameter
is θ = p, where p denotes the proportion of defectives in the lot. The
lot is accepted if, and only if, the first n_0 units inspected are non-
defective. The probability that the first unit inspected is non-defective
is equal to 1 - p. On the assumption that the size of the lot is suf-
ficiently large as compared with n_0, the successive observations may
be treated as being independent. Then the probability that all n_0
units will be non-defective is equal to (1 - p)^{n_0}. Thus, the operating
characteristic function is given by

    L(p) = (1 - p)^{n_0}

This function can be plotted, as shown in Fig. 4, by measuring p
along the horizontal axis and L(p) along the vertical axis.
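
As a small numerical illustration of this OC function (a sketch only; the function name oc_lot_test and the value n_0 = 20 are assumptions chosen for the example, not taken from the text), L(p) can be tabulated directly:

```python
def oc_lot_test(p: float, n0: int) -> float:
    """Probability L(p) of accepting the lot: all n0 inspected units are non-defective."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return (1.0 - p) ** n0

# Tabulate L(p) for a hypothetical n0 = 20.
for p in (0.0, 0.01, 0.05, 0.10, 0.20):
    print(f"p = {p:.2f}  L(p) = {oc_lot_test(p, 20):.4f}")
```

The table falls from 1 at p = 0 toward 0 as p grows, which is the shape of the curve the text attributes to Fig. 4.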



The OC function describes what the sequential test procedure ac-
complishes. For any parameter point θ the probability of making a
correct decision can be obtained immediately from the OC function.
If the parameter point θ is consistent with the hypothesis H_0 to be
tested, then the probability of making a correct decision is equal to
L(θ). If the true parameter point θ is not consistent with the hypoth-
esis H_0, the probability of making a correct decision is equal to
1 - L(θ). Clearly, an OC function is considered more favorable the
higher the value of L(θ) for θ consistent with H_0 and the lower the
value of L(θ) for θ not consistent with H_0.

2.2.2 The Average (Expected) Sample Number (ASN) Function of 
a Sequential Test 

We have pointed out before that the number of observations re-
quired by a sequential test is not predetermined, but is a random vari-
able, because at any stage of the experiment the decision to terminate
the process depends on the results of the observations made so far.
For example, for the particular sequential test discussed in the pre-
ceding section, the number of observations required by the test may
be anything from 1 to n_0. If no defects are found during the sampling
process, we shall make n_0 observations. On the other hand, if the
first m - 1 units inspected are non-defective and the mth unit is de-
fective for some value m < n_0, then the total number of observations
made will be equal to m.

We shall denote by n the number of observations required by the
sequential test. Then n is a random variable. Carrying out the same
sequential test procedure repeatedly, we shall obtain, in general, dif-
ferent values for n. Of particular interest is the expected value of n
(the average value of n in the long run, when the same test procedure
is applied repeatedly). For any given test procedure the expected
value of n depends only on the distribution of x. Since the distribu-
tion of x is determined by the parameter point θ, the expected value
of n depends only on θ. For any given parameter point θ, we shall
denote the expected value of n by E_θ(n). If there is only one unknown
parameter θ the function E_θ(n) can be plotted as a curve, θ being meas-
ured along the horizontal axis and E_θ(n) along the vertical axis. We
shall refer to the average sample number function E_θ(n) briefly as the
ASN function.

As an example, we shall compute the ASN function for the particular
sequential test discussed in the preceding section. For any positive
integral value m < n_0, the probability that the test will be terminated
at the mth observation is given by (1 - p)^{m-1} p. We shall inspect n_0
units if and only if the first n_0 - 1 units are found non-defective.
Thus, the probability that the test will require exactly n_0 observations
is equal to (1 - p)^{n_0 - 1}. Hence, the expected value of n is given by

    E_p(n) = \sum_{m=1}^{n_0 - 1} m p (1 - p)^{m-1} + n_0 (1 - p)^{n_0 - 1}

The graph of the ASN function will be of the type shown in Fig. 5. 
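
The sum above can be evaluated term by term, as in the following sketch. The function names are hypothetical. The closed form (1 - (1 - p)^{n_0}) / p is not stated in the text; it is added here as a check and follows from summing the probabilities that at least m observations are needed, for m = 1, ..., n_0.

```python
def asn_lot_test(p: float, n0: int) -> float:
    """Expected number of observations E_p(n), summed term by term."""
    q = 1.0 - p
    # Stop at observation m < n0: first m - 1 units good, m-th unit defective.
    stop_early = sum(m * p * q ** (m - 1) for m in range(1, n0))
    # Exactly n0 observations: the first n0 - 1 units are all good.
    return stop_early + n0 * q ** (n0 - 1)

def asn_closed_form(p: float, n0: int) -> float:
    """Closed form (1 - (1 - p)**n0) / p, valid for p > 0 (an added check)."""
    return (1.0 - (1.0 - p) ** n0) / p
```

At p = 0 the sum gives n_0, and at p = 1 it gives 1, in agreement with the remark that n ranges from 1 to n_0.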



An OC function and an ASN function are associated with each test
procedure. These two functions are perhaps the most important con-
sequences of a test procedure. The OC function describes how well
the test procedure achieves its objective of making correct decisions,
and the ASN function represents the price we have to pay, in terms
of the number of observations required for the test. Thus, in judging
the relative merits of two different test procedures, we shall compare
the OC and ASN functions of these two tests.

2.3 Principles for the Selection of a Sequential Test 

2.3.1 Degree of Preference for Acceptance or Rejection of the Null
Hypothesis H_0 as a Function of the Parameter θ

In order to set up principles for the selection of a sequential test
it is necessary to investigate the dependence of the preference for re-
jection or acceptance of the null hypothesis H_0 on the parameter point
θ. Denote by ω the set of all those parameter points θ which are con-
sistent with H_0, i.e., H_0 is precisely the statement that the true pa-
rameter point is included in the set ω. For example, if there is only
one unknown parameter θ and if H_0 is the hypothesis that θ is less
than or equal to a certain particular value θ_0, ω is the set of all values
θ for which θ ≤ θ_0. Since a correct decision is preferred to a wrong
decision, we can say that acceptance of H_0 is preferred whenever θ is
in ω, and rejection of H_0 is preferred whenever θ is outside ω.

The mere statement of preference for acceptance or rejection of H_0
is not yet a sufficient guide for the selection of a proper sequential test.
For this purpose it is necessary to know something about the degree
of preference for acceptance or rejection as a function of the parameter
point θ.

We shall denote by ω̄ the set of all parameter points which lie outside
ω. A point θ will be said to be on the boundary of ω, or a boundary
point of ω, if any arbitrarily small neighborhood of θ contains points
of ω as well as of ω̄. The totality of all boundary points of ω will be
called the boundary of ω. If, for example, there is only one unknown
parameter and ω is defined by θ ≤ θ_0, then θ_0 is the only boundary point
of ω. If ω is the set of all values θ for which θ_0 ≤ θ ≤ θ_1, then both θ_0
and θ_1 are boundary points. If the true parameter point θ lies in ω
but is near the boundary of ω, the preference for acceptance of H_0 will,
in general, be only slight. Similarly, if the true point θ lies in ω̄ but
near the boundary of ω, the preference for rejection of H_0 will be only
slight. In other words, the rejection of H_0 is not considered to be a
serious error if θ is in ω but near the boundary. Similarly, the accept-
ance of H_0 is not considered a serious error if θ is in ω̄ but near the
boundary of ω. If the true point θ lies exactly on the boundary of ω
there will be, in general, no definite preference for one or the other
action, i.e., it will be indifferent to us whether the hypothesis H_0 is
accepted or rejected.

In general, it will be possible to subdivide the totality of all param-
eter points (parameter space) into three mutually exclusive zones: a
zone consisting of all points θ for which acceptance of H_0 is strongly
preferred; a zone consisting of points θ for which rejection of H_0 is
strongly preferred; and a zone consisting of all points θ which are not
included in either of the first two zones, i.e., the third zone consists
of all points θ for which neither acceptance nor rejection of H_0 is
strongly preferred. We shall refer to the first zone as the zone of
preference for acceptance, to the second zone as the zone of preference
for rejection, and to the third zone as the zone of indifference. The
zone of preference for acceptance will always be a subset of ω and the
zone of preference for rejection will be a subset of ω̄. The zone of in-
difference will usually consist of points of ω and ω̄ which are near the
boundary or on the boundary of ω.

Although the subdivision of the parameter space into three zones as
described above is used as a basis for the selection of a sequential test,
it cannot be considered a statistical problem. Such a subdivision is
made in each case on the basis of practical considerations concerning
the consequences of a wrong decision.

The subdivision of the parameter space into the above-mentioned
three zones gives a somewhat sketchy picture of the degree of pref-
erence for acceptance or rejection as a function of the parameter θ.
A more refined description of the degree of preference for one or the
other action can be given in terms of two functions w_0(θ) and w_1(θ),
where w_0(θ) expresses the relative importance of, i.e., the loss caused
by, the error of accepting H_0 when θ is true, and w_1(θ) expresses the
relative importance of the error of rejecting H_0 when θ is true. The
function w_0(θ) = 0 for any θ in ω, since for such points θ the accept-
ance of H_0 is a correct decision. For any θ in ω̄, w_0(θ) will have a
positive value which will, in general, increase with increasing distance
of θ from the boundary of ω. Similarly, w_1(θ) = 0 for all θ in ω̄ and
w_1(θ) > 0 for all θ in ω. Again, w_1(θ) will, in general, increase with
increasing distance of θ from the boundary of ω. Our subdivision of
the parameter space into three zones may be interpreted as being
equivalent to choosing the functions w_0(θ) and w_1(θ) as follows:
w_0(θ) = 0 when θ is in the zone of preference for acceptance or in the
zone of indifference. For any θ in the zone of preference for rejection,
w_0(θ) has a high positive value, say c_0, indicating that the loss caused
by acceptance is of practical importance. Similarly, w_1(θ) = 0 for any
θ in the zone of preference for rejection or in the zone of indifference.
For any θ in the zone of preference for acceptance, w_1(θ) has some high
value, say c_1, indicating that the loss caused by rejection of H_0 is of
practical importance. Although a refined description of the depend-
ence of the degree of preference for one or the other action on θ may
occasionally require the use of continuous functions w_0(θ) and w_1(θ),
the step functions implied by the subdivision of the parameter space
into three zones will give a sufficiently good approximation for most
practical purposes. They also have the advantage of great simplicity.
Thus, in what follows we shall assume that the dependence of the
preference for one or the other action on θ is described by a subdivision
of the parameter space into three zones of the type mentioned above.
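
The step functions implied by the three zones can be written down directly. The sketch below is illustrative only: the function name step_losses, the zone labels "accept", "reject" and "indifferent", and the idea of passing the zone assignment as a callable are all assumptions of this example.

```python
from typing import Callable, Tuple

Loss = Callable[[float], float]

def step_losses(zone: Callable[[float], str], c0: float, c1: float) -> Tuple[Loss, Loss]:
    """Build the step functions w0 and w1 from a zone assignment.

    `zone(theta)` is assumed to return "accept", "reject" or "indifferent",
    naming the zone that contains theta.
    """
    def w0(theta: float) -> float:
        # Loss of accepting H0: positive only in the zone of preference for rejection.
        return c0 if zone(theta) == "reject" else 0.0

    def w1(theta: float) -> float:
        # Loss of rejecting H0: positive only in the zone of preference for acceptance.
        return c1 if zone(theta) == "accept" else 0.0

    return w0, w1
```

Both functions vanish throughout the zone of indifference, which is what distinguishes this simplified description from continuous loss functions that grow with the distance of θ from the boundary of ω.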

As an illustration, we shall discuss briefly a few examples. Consider
first the case in which a lot consisting of a large number of units of a
manufactured product is submitted for acceptance inspection. Assum-
ing that the units are classified in one of the two categories, defective
and non-defective, the preference for acceptance or rejection of the lot
depends only on the proportion p of defectives in the lot, which is
unknown. In this case there is only one unknown parameter θ which
is equal to the proportion p of defectives in the lot. It will, in general,
be possible to select two values p_0 and p_1 (p_0 < p_1) such that for any
p ≤ p_0 the rejection of the lot is an error of practical importance, for
any p ≥ p_1 the acceptance of the lot is considered a wrong decision of
practical importance, whereas for any value p between p_0 and p_1 there
is no strong preference for either action. Thus, the zone of indifference
may be defined as the interval from p_0 to p_1, the zone of preference for
acceptance as the set consisting of all values p ≤ p_0, and the zone of
preference for rejection as the set of all values p ≥ p_1.

As a second example, consider the case in which the hardness x of
a certain product varies from unit to unit such that x may be con-
sidered a normally distributed variable in the population of all units
produced. Suppose that the mean value θ of x is unknown but that
the standard deviation of x is known. Assume that the most desir-
able value of θ is θ_0 and that the product becomes less desirable as
the absolute deviation |θ - θ_0| between the true mean and the most
desirable value θ_0 becomes greater. Suppose that the problem is to
decide whether the product should be put on the market or not. In
such a case it will be possible to find a positive value c such that if
|θ - θ_0| < c we prefer to put the product on the market, and if
|θ - θ_0| > c we prefer to withhold the product. For |θ - θ_0| = c
we are indifferent as to which action is taken. Thus, the hypothesis
H_0 may be defined as the hypothesis that |θ - θ_0| < c. We shall not
define the zone of indifference by the equation |θ - θ_0| = c, since if
|θ - θ_0| differs only slightly from c, the preference for one action over
the other is only slight and of no practical importance. However, it will
be possible to find a positive value Δ such that, if |θ - θ_0| < c - Δ,
we strongly prefer to accept H_0 (to put the product on the market)
and, if |θ - θ_0| > c + Δ, we strongly prefer to reject H_0 (not to put
the product on the market) whereas, if c - Δ ≤ |θ - θ_0| ≤ c + Δ,
no strong preference is given to either action. Thus, the zone of indif-
ference may be defined by the inequality c - Δ ≤ |θ - θ_0| ≤ c + Δ,
the zone of preference for acceptance by |θ - θ_0| < c - Δ, and the
zone of preference for rejection by |θ - θ_0| > c + Δ.
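
The three zones of this example translate into a few comparisons. The following sketch uses a hypothetical function name and zone labels; the requirement Δ < c is an added assumption, made so that the zone of preference for acceptance is not empty.

```python
def hardness_zone(theta: float, theta0: float, c: float, delta: float) -> str:
    """Zone containing the mean hardness theta, given target theta0, tolerance c, margin delta."""
    if not 0.0 < delta < c:
        raise ValueError("this sketch assumes 0 < delta < c")
    deviation = abs(theta - theta0)
    if deviation < c - delta:
        return "accept"        # zone of preference for acceptance
    if deviation > c + delta:
        return "reject"        # zone of preference for rejection
    return "indifferent"       # c - delta <= |theta - theta0| <= c + delta
```

With θ_0 = 50, c = 2 and Δ = 0.5, for example, a mean of 51.5 falls exactly on the inner edge of the zone of indifference.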

In each of the previous two examples there was only one unknown
parameter. We shall now consider an example where there are two
unknown parameters. Suppose that a lot consisting of a large num-
ber of units of a manufactured product is submitted for acceptance
inspection. Assume that the characteristic of the product in which
we are interested is the resistance to pressure, which is a measurable
quantity x. It is assumed that x varies from unit to unit in the lot
and has a normal distribution with unknown mean μ and unknown
standard deviation σ. Let L be a value such that acceptance of the
lot is strongly preferred if the proportion of units in the lot with
resistance x ≤ L does not exceed .01, rejection of the lot is strongly
preferred if the proportion of units in the lot for which x ≤ L exceeds
.05, and no strong preference exists for either action if the proportion
of units in the lot with x ≤ L lies between .01 and .05. The propor-
tion of units with x ≤ L is greater than or equal to .05 if, and only if,
(μ - L)/σ ≤ λ_1, and the proportion of such units is ≤ .01 if, and
only if, (μ - L)/σ ≥ λ_2 (λ_1 < λ_2). The values λ_1 and λ_2 can be obtained
from a table of the normal distribution. Thus the zone of preference
for rejection is given by the set of all values μ and σ for which
(μ - L)/σ ≤ λ_1, the zone of preference for acceptance is given by
(μ - L)/σ ≥ λ_2, and the zone of indifference is given by λ_1 < (μ - L)/σ
< λ_2. These three zones are represented in Fig. 6, where μ is measured
along the horizontal axis and σ along the vertical axis. The zone of
indifference is bounded by two straight lines which go through the
point L on the abscissa axis and have slopes 1/λ_1 and 1/λ_2, respectively.
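
Instead of a printed table, the two constants can be taken from the inverse of the standard normal distribution function. The sketch below is an illustration, not the author's procedure: the function name and zone labels are hypothetical, and it relies on the fact that the proportion of units with x ≤ L equals Φ((L - μ)/σ), where Φ is the standard normal distribution function.

```python
from statistics import NormalDist

def pressure_zone(mu: float, sigma: float, L: float,
                  p_accept: float = 0.01, p_reject: float = 0.05) -> str:
    """Zone containing the parameter point (mu, sigma) in the two-parameter example."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    std = NormalDist()                # standard normal distribution
    lam1 = -std.inv_cdf(p_reject)     # about 1.645
    lam2 = -std.inv_cdf(p_accept)     # about 2.326
    t = (mu - L) / sigma
    if t <= lam1:
        return "reject"               # proportion with x <= L is at least p_reject
    if t >= lam2:
        return "accept"               # proportion with x <= L is at most p_accept
    return "indifferent"
```

Because the zone depends on μ and σ only through (μ - L)/σ, the boundaries are the straight lines through the point L described above.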

2.3.2 Requirements Imposed on the OC Function 

Suppose that the hypothesis H_0 to be tested states that the true
parameter point θ lies in a given set ω of parameter points. Then we
wish to make the probability of accepting H_0 as high as possible when
θ lies in ω, and as low as possible when θ is outside ω. Since the prob-
ability of accepting H_0 is by definition equal to the OC function L(θ),
an OC function is considered more desirable the higher the value of
L(θ) for any θ in ω and the lower the value of L(θ) for any θ outside ω.
An ideal OC function would be given by a function L(θ) such that
L(θ) = 1 for any θ in ω and L(θ) = 0 for any θ outside ω. Suppose,
for example, that there is only one unknown parameter θ and the
hypothesis to be tested is the statement that θ ≤ θ_0. Then, an ideal
OC function, as shown in Fig. 7, would be given by a function L(θ)
such that L(θ) = 1 for θ ≤ θ_0 and L(θ) = 0 for θ > θ_0.

Fig. 7


The ideal OC function can never be achieved on the basis of the
incomplete information about θ supplied by a random sample
drawn from the population, but it can be approached arbitrarily closely
if we are willing to take a sufficiently large sample.

The closer the OC function is to the ideal function and the smaller
the number of observations required, the more desirable is
the sequential test. These two desirable features of a test are some-
what in conflict, since the closer we approach the ideal form of the
OC function, the larger, in general, will be the number of observations
required by the test. To achieve a compromise between these two
conflicting desiderata, we may proceed as follows. First we formulate
requirements concerning the closeness of the OC function to the ideal
function and then consider only tests which satisfy these requirements.
From these tests we try to select one for which the expected number
of observations required by the test is as small as possible. To impose
the desired conditions on the OC function first and then to minimize
with respect to the expected number of observations does not seem to
be an unreasonable procedure, since the OC function is perhaps of
primary importance.

To formulate requirements on the OC function, we shall make use
of the subdivision of the parameter space into the three zones discussed
in the preceding section. Since in the zone of indifference there is no
strong preference for one or the other action, we shall not impose any
conditions on the behavior of L(θ) within the zone of indifference. In
the zones of preference for acceptance and rejection the requirements
on the OC function may reasonably be stated as follows. For any θ
in the zone of preference for acceptance the probability of rejecting
the hypothesis H_0, i.e., the value of 1 - L(θ), should be less than or
equal to a preassigned value α, and for any θ in the zone of preference
for rejection the probability of accepting H_0, i.e., the value of L(θ),
should be less than or equal to a preassigned value β.

We can summarize the requirements imposed on the OC function
as follows. First the parameter space is subdivided into three mutually
exclusive zones: a zone of preference for acceptance, a zone of prefer-
ence for rejection, and a zone of indifference. Then two positive values
α and β, both < 1, are selected. The requirements imposed on the
OC function are then given by the two following conditions:

(2:1) 1 - L(θ) ≤ α for any θ in the zone of preference for acceptance

(2:2) L(θ) ≤ β for any θ in the zone of preference for rejection

Condition (2:1) can also be written as

(2:3) L(θ) ≥ 1 - α for any θ in the zone of preference for acceptance

The subdivision of the parameter space into three zones, as well as
the choice of the values α and β, is to be made on the basis of
practical considerations in each particular case. We shall say that a
sequential test is admissible if it satisfies the requirements (2:2)
and (2:3).
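
As an illustration of conditions (2:2) and (2:3), consider again the lot-inspection test of Section 2.1, whose OC function is L(p) = (1 - p)^{n_0}, with zones p ≤ p_0 and p ≥ p_1 as in Section 2.3.1. Since this L(p) decreases as p increases, it is enough to check the two zone boundaries. The sketch below rests on that observation; the function name and the numerical values are assumptions chosen for the example.

```python
def lot_test_is_admissible(n0: int, p0: float, p1: float,
                           alpha: float, beta: float) -> bool:
    """Check (2:3) and (2:2) for the test that accepts iff the first n0 units are good."""
    def oc(p: float) -> float:
        return (1.0 - p) ** n0       # L(p), decreasing in p
    meets_2_3 = oc(p0) >= 1.0 - alpha   # worst case over the zone p <= p0
    meets_2_2 = oc(p1) <= beta          # worst case over the zone p >= p1
    return meets_2_3 and meets_2_2

# Hypothetical requirements: p0 = .001, p1 = .1, alpha = .05, beta = .1.
admissible = [n0 for n0 in range(1, 101)
              if lot_test_is_admissible(n0, 0.001, 0.1, 0.05, 0.1)]
```

A small n_0 fails (2:2) because bad lots are accepted too often, and a large n_0 fails (2:3) because good lots are rejected too often, so only an intermediate range of n_0 is admissible for these values.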



A typical OC function satisfying the conditions (2:2) and (2:3) is
shown in Fig. 8, where there is only one unknown parameter θ and the
zone of preference for acceptance is defined by θ ≤ θ_0, and the zone of
preference for rejection is defined by θ ≥ θ_1 (θ_0 < θ_1).



2.3.3 The ASN Function as a Basis for the Selection of a Sequen- 
tial Test 


After the parameter space has been subdivided into three zones and
the quantities α and β have been chosen, we consider only tests which
are admissible, i.e., tests which satisfy the conditions (2:2) and (2:3).
Clearly, we wish to select a sequential test for which the expected value
of the number of observations required by the test is as small as pos-
sible. This expected value E_θ(n) depends, as we have seen in Section
2.2.2, on the parameter point θ. In Section 2.2.2 we referred to the
function E_θ(n) as the ASN function of the test.

The expected value E_θ(n) of the number of observations to be made
depends, of course, also on the particular sequential test used. To put
this dependence in evidence, we shall occasionally use the symbol
E_θ(n | S) to denote the value E_θ(n) when the sequential test S is applied.

It is of particular interest to consider for any particular θ the mini-
mum² value of E_θ(n | S) with respect to S where S may be any admis-
sible sequential test. This minimum value, in symbols Min_S E_θ(n | S),
depends only on θ. Clearly, for any admissible sequential test S' we
have

    E_θ(n | S') ≥ Min_S E_θ(n | S)


If an admissible sequential test S_0 exists for which the expected value
of the number of observations is minimized for all θ, i.e., for which
E_θ(n | S_0) = Min_S E_θ(n | S) for all θ, then S_0 may be regarded as a
"uniformly best" test. In general, however, no uniformly best test
exists,³ i.e., it will not be possible to minimize the expected value of
the required number of observations simultaneously for all θ. Thus,
in such cases some compromise principle is to be adopted for the selec-
tion of a sequential test. We do not propose to enter into a discussion
of the various possible compromise principles that could be advanced,
since the various possibilities have not yet been fully investigated.
However, for the particular, but theoretically very interesting, case
when a simple hypothesis is tested against a single alternative, the
situation has been clarified and we shall discuss it in some detail in
the next section.

² ... the greatest lower bound with ...

2.4 The Case When a Simple Hypothesis H_0 Is Tested against a
Single Alternative H_1

2.4.1 Efficiency of a Sequential Test 

We shall consider only two values of the parameter θ, say θ_0 and θ_1.
Let H_0 be the hypothesis that θ = θ_0 and let H_1 denote the hypothesis
that θ = θ_1. We shall refer to H_0 as the null hypothesis and to H_1 as
the alternative hypothesis. With any sequential test of the hypothesis
H_0 against the alternative hypothesis H_1 there will be associated two
numbers α and β between 0 and 1 such that if H_0 is true the prob-
ability is α that we shall commit an error of the first kind (we shall
reject H_0), and if H_1 is true the probability is β that we shall commit
an error of the second kind (we shall accept H_0). Two sequential tests
S and S' will be said to be of equal strength if the values α and β
associated with S are equal to the corresponding values α' and β' as-
sociated with S'. If α < α' and β ≤ β', or if α ≤ α' and β < β', we
shall say that S is stronger than S' (S' is weaker than S). If α < α'
and β > β', or if α > α' and β < β', we shall say that the strength of
S is not comparable to that of S'.
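
The ordering by strength is only a partial ordering, which a short sketch makes explicit. The function name and the returned labels are hypothetical; the comparisons follow the definitions just given.

```python
def compare_strength(alpha: float, beta: float,
                     alpha_prime: float, beta_prime: float) -> str:
    """Compare the strength (alpha, beta) of S with the strength (alpha', beta') of S'."""
    if alpha == alpha_prime and beta == beta_prime:
        return "equal"
    if alpha <= alpha_prime and beta <= beta_prime:
        return "stronger"         # S is stronger than S'
    if alpha >= alpha_prime and beta >= beta_prime:
        return "weaker"           # S is weaker than S'
    return "not comparable"       # one error probability smaller, the other larger
```

For example, a test with (α, β) = (.01, .10) and one with (.05, .05) are not comparable, since each has the smaller probability for one kind of error.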

Restricting ourselves to sequential tests of a given strength (α, β),
a test may be regarded as more desirable the smaller the expected
number of observations required by the test. If S and S' are two
sequential tests of equal strength such that E_{θ_0}(n | S) ≤ E_{θ_0}(n | S') and
E_{θ_1}(n | S) < E_{θ_1}(n | S'), or E_{θ_0}(n | S) < E_{θ_0}(n | S') and E_{θ_1}(n | S) ≤
E_{θ_1}(n | S'), the test S will be considered preferable to S'. If a test
S_0 exists such that E_{θ_0}(n | S_0) ≤ E_{θ_0}(n | S) and E_{θ_1}(n | S_0) ≤ E_{θ_1}(n | S)
for all tests S of strength equal to that of S_0, we shall say that S_0 is an
optimum test.

³ The situation here is similar to that in the Neyman-Pearson theory of testing
hypotheses, where uniformly most powerful tests exist only in exceptional cases.

We shall denote by n_0(α, β) the minimum value of E_{θ_0}(n | S) with
respect to S, and by n_1(α, β) the minimum value of E_{θ_1}(n | S) with
respect to S, where S may be any sequential test of strength (α, β).
Then for any sequential test S of strength (α, β) we have E_{θ_0}(n | S) ≥
n_0(α, β) and E_{θ_1}(n | S) ≥ n_1(α, β). A sequential test S of strength
(α, β) is an optimum test if E_{θ_0}(n | S) = n_0(α, β) and E_{θ_1}(n | S) =
n_1(α, β). The existence of an optimum test has not been proved.
However, it will be shown in Section A.7 of the Appendix that for the
so-called sequential probability ratio test S_0 of strength (α, β), defined
in Chapter 3, the ratios

(2:4)    E_{θ_0}(n | S_0) / n_0(α, β)    and    E_{θ_1}(n | S_0) / n_1(α, β)

can exceed 1 only by very small quantities which can be neglected for
practical purposes. Thus, for all practical purposes, the sequential
probability ratio test may be regarded as an optimum test.⁵ In Sec-
tion A.7 it is also shown that the ratios (2:4) converge to 1 as θ_1 ap-
proaches θ_0.

We shall define the efficiency of a sequential test S of strength (α, β)
by the ratio n_0(α, β) / E_{θ_0}(n | S) when H_0 is true, and by
n_1(α, β) / E_{θ_1}(n | S) when H_1 is true. Clearly, the efficiency of a
sequential test under H_0, as well as under H_1, lies always between
0 and 1. The greater the efficiency of
a sequential test of a given strength the more desirable it is. An opti-
mum test has the efficiency 1 under H_0, as well as under H_1. The se-
quential probability ratio test for testing H_0 against H_1 is shown in
Section A.7 to have an efficiency, if not exactly, very nearly equal to 1
under H_0 as well as under H_1. As mentioned before, in Section A.7
it is shown that the efficiency of the sequential probability ratio test
approaches 1 under H_0 as well as under H_1, when θ_1 approaches θ_0.

2.4.2 Efficiency of the Current Test Procedure, Viewed as a Par- 
ticular Case of a Sequential Test 

The current test procedure may be regarded as a particular case of
a sequential test. In fact, if N denotes the fixed number of observa-
tions used in the current procedure and if W_N denotes the critical region,
lol" boLd!'"""'"’ 

⁵ The author conjectures that the sequential probability ratio test is exactly an
optimum test, but he did not succeed in proving this.


i.e., W_N is the totality of all those samples of size N for which the
hypothesis under test is rejected, then the current procedure may be
regarded as a sequential test defined as follows. For all positive inte-
gral values m < N, the regions R_m^0 and R_m^1 are the empty subsets of the
m-dimensional sample space and R_m = M_m. For m = N, R_N^1 is
equal to W_N, R_N^0 is equal to the totality of all samples of size N not
contained in R_N^1, and R_N is the empty set. Thus, for the current test
procedure we have E_{θ_0}(n) = E_{θ_1}(n) = N.
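
The construction just described can be written as a decision rule of the sequential kind. This is a sketch under assumptions: the function name, the string labels, and the representation of the critical region W_N as a predicate on the sample are all choices made for the example.

```python
from typing import Callable, Sequence

def fixed_sample_rule(N: int,
                      in_critical_region: Callable[[Sequence[float]], bool]
                      ) -> Callable[[Sequence[float]], str]:
    """Express a fixed-sample-size test as a sequential decision rule."""
    def rule(sample: Sequence[float]) -> str:
        if len(sample) < N:
            return "continue"     # R_m^0 and R_m^1 are empty for m < N
        # At m = N the continuation region is empty: the sample lies in W_N or not.
        return "reject" if in_critical_region(sample) else "accept"
    return rule

# Hypothetical critical region: at least two defectives among N = 3 units.
rule = fixed_sample_rule(3, lambda s: sum(s) >= 2)
```

Such a rule never stops before the Nth observation, whatever the data, which is why the number of observations is the constant N under both hypotheses.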

It will be shown later that the efficiency of the current test for test-
ing H_0 against H_1, based on the most powerful critical region, is rather
low. Frequently it is below 1/2. In other words, an optimum sequen-
tial test can attain the same α and β as the current most powerful test
on the basis of an expected number of observations much smaller than
the fixed number of observations needed for the current most powerful
test.

In Chapter 3 a simple sequential test procedure for testing H_0 against
H_1 will be proposed. It is called the sequential probability ratio test,
which for practical purposes can be regarded as an optimum sequential
test. It will be seen that these sequential tests usually lead to average
savings of about 50 per cent in the number of trials as compared with
the current most powerful test.



Chapter 3. THE SEQUENTIAL PROBABILITY RATIO TEST FOR TESTING A SIMPLE HYPOTHESIS $H_0$ AGAINST A SINGLE ALTERNATIVE $H_1$


3.1 Definition of the Sequential Probability Ratio Test 

Let $f(x, \theta)$ denote the distribution of the random variable x under consideration.¹ Let $H_0$ be the hypothesis that $\theta = \theta_0$, and $H_1$ the hypothesis that $\theta = \theta_1$. Thus, the distribution of x is given by $f(x, \theta_0)$ when $H_0$ is true, and by $f(x, \theta_1)$ when $H_1$ is true. We shall denote the successive observations on x by $x_1, x_2, \dots$, etc.

As mentioned before, we consider only two cases: (1) x admits a 
probability density function; (2) x has a discrete distribution. It is 
our intention to cover both cases simultaneously. However, the diffi- 
culty arises that some statements will have to be formulated slightly 
differently, depending on whether x admits a density function or x has 
a discrete distribution. This difference in formulation is caused mostly 
by the fact that “probability density” in the continuous case is to be 
replaced by “probability” in the discrete case. For the sake of brevity, 
we shall occasionally use the word “probability” to mean “probability 
density” in the continuous case, if this can be done without danger of 
confusion. With this understanding it will frequently be possible to 
cover the discrete, as well as the continuous, case with a single statement. 

For any positive integral value m the probability that a sample $x_1, \dots, x_m$ is obtained is given by

$$p_{1m} = f(x_1, \theta_1) \cdots f(x_m, \theta_1)$$

when $H_1$ is true, and by

$$p_{0m} = f(x_1, \theta_0) \cdots f(x_m, \theta_0)$$

when $H_0$ is true.

¹ $f(x, \theta)$ denotes the probability density function of x, if a density function exists. If x has a discrete distribution, $f(x, \theta)$ denotes the probability that the random variable under consideration takes the value x.

The sequential probability ratio test for testing $H_0$ against $H_1$ is defined as follows: Two positive constants A and B ($B < A$) are chosen. At each stage of the experiment (at the mth trial for any integral value m), the probability ratio $p_{1m}/p_{0m}$ is computed.² If

(3:1)  $$B < \frac{p_{1m}}{p_{0m}} < A$$

the experiment is continued by taking an additional observation. If

(3:2)  $$\frac{p_{1m}}{p_{0m}} \ge A$$

the process is terminated with the rejection of $H_0$ (acceptance of $H_1$). If

(3:3)  $$\frac{p_{1m}}{p_{0m}} \le B$$

the process is terminated with the acceptance of $H_0$.

The constants A and B are to be determined so that the test will have the prescribed strength $(\alpha, \beta)$. The relations among the quantities $\alpha$, $\beta$, A, and B will be discussed in the next section.

For purposes of practical computation, it is much more convenient to compute the logarithm of the ratio $p_{1m}/p_{0m}$ than the ratio $p_{1m}/p_{0m}$ itself. The reason for this is that $\log (p_{1m}/p_{0m})$ can be written as the sum of m terms, i.e.,

(3:4)  $$\log \frac{p_{1m}}{p_{0m}} = \log \frac{f(x_1, \theta_1)}{f(x_1, \theta_0)} + \cdots + \log \frac{f(x_m, \theta_1)}{f(x_m, \theta_0)}$$

We shall denote the ith term in this sum by $z_i$, i.e.,

(3:5)  $$z_i = \log \frac{f(x_i, \theta_1)}{f(x_i, \theta_0)}$$

The test procedure is carried out as follows, the quantities $z_i$ ($i = 1, 2, \dots$) being used: At each stage of the experiment (at the mth trial for each integral value of m), the cumulative sum $z_1 + \cdots + z_m$ is computed. If

(3:6)  $$\log B < z_1 + \cdots + z_m < \log A$$

the experiment is continued by taking an additional observation. If

$$z_1 + \cdots + z_m \ge \log A$$

the process is terminated with the rejection of $H_0$. If

$$z_1 + \cdots + z_m \le \log B$$

the process is terminated with the acceptance of $H_0$.
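The rule (3:6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sprt`, the decision strings, and the `log_ratio` callback (which must return a finite $z_i$ for each observation) are hypothetical and not part of the text, and the conventions of footnote 2 for vanishing probabilities are not handled.

```python
import math
from typing import Callable, Iterable, Tuple


def sprt(observations: Iterable[float],
         log_ratio: Callable[[float], float],
         A: float, B: float) -> Tuple[str, int]:
    """Apply rule (3:6) to a stream of observations.

    log_ratio(x) must return z = log f(x, theta1) / f(x, theta0).
    Returns (decision, number of observations used).
    """
    if not 0.0 < B < A:
        raise ValueError("need 0 < B < A")
    log_a, log_b = math.log(A), math.log(B)
    total = 0.0  # cumulative sum z_1 + ... + z_m
    m = 0
    for m, x in enumerate(observations, start=1):
        total += log_ratio(x)
        if total >= log_a:
            return "reject H0", m
        if total <= log_b:
            return "accept H0", m
    # The data ran out while log B < sum < log A.
    return "undecided", m
```

With a finite list of data the sketch may run out of observations before either boundary is reached, which it reports as "undecided"; in the procedure of the text sampling would simply continue.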

² If for a particular sample $p_{1m} = p_{0m} = 0$, we shall define the value of the ratio $p_{1m}/p_{0m}$ as 1. If for some sample $(x_1, \dots, x_m)$ we have $p_{1m} > 0$ but $p_{0m} = 0$, the inequality (3:2) is considered fulfilled and $H_0$ is rejected.





A few simple illustrations will help to make the procedure more concrete. Suppose that the random variable x can take only two values, 0 and 1. We shall denote the probability that $x = 1$ by p, the value of which is assumed to be unknown. Thus, p is the unknown parameter of the distribution. The distribution of x is given by the function $f(x, p)$, which is defined only for two values of x, namely $x = 0$ and $x = 1$: $f(1, p) = p$ and $f(0, p) = 1 - p$. Let $H_0$ be the hypothesis that $p = p_0$ and $H_1$ the hypothesis that $p = p_1$ ($p_1 \neq p_0$). Then

$$z_i = \log \frac{f(x_i, p_1)}{f(x_i, p_0)} = x_i \log \frac{p_1}{p_0} + (1 - x_i) \log \frac{1 - p_1}{1 - p_0}$$

Hence,

(3:7)  $$z_1 + \cdots + z_m = m^* \log \frac{p_1}{p_0} + (m - m^*) \log \frac{1 - p_1}{1 - p_0}$$

where $m^*$ denotes the number of ones in the sequence of the first m observations. We accept $H_0$ if

$$m^* \log \frac{p_1}{p_0} + (m - m^*) \log \frac{1 - p_1}{1 - p_0} \le \log B$$

We reject $H_0$ (accept $H_1$) if

$$m^* \log \frac{p_1}{p_0} + (m - m^*) \log \frac{1 - p_1}{1 - p_0} \ge \log A$$

We continue the experiment by taking an additional observation if

$$\log B < m^* \log \frac{p_1}{p_0} + (m - m^*) \log \frac{1 - p_1}{1 - p_0} < \log A$$

The expression (3:7) can, of course, be obtained cumulatively. If an observation is a one, the constant $\log (p_1/p_0)$ is added to the preceding value of (3:7) to obtain the new value. If the observation is a zero, the constant $\log [(1 - p_1)/(1 - p_0)]$ is added.
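The cumulative computation just described can be sketched as follows. The names `bernoulli_increments` and `bernoulli_statistic` are hypothetical helpers chosen for this illustration; the sketch assumes $0 < p_0, p_1 < 1$ so that both logarithms are finite.

```python
import math


def bernoulli_increments(p0: float, p1: float):
    """Return the two constants added to (3:7): (for a one, for a zero)."""
    return math.log(p1 / p0), math.log((1.0 - p1) / (1.0 - p0))


def bernoulli_statistic(m: int, ones: int, p0: float, p1: float) -> float:
    """Closed form (3:7), with m* = ones among the first m observations."""
    inc_one, inc_zero = bernoulli_increments(p0, p1)
    return ones * inc_one + (m - ones) * inc_zero
```

Adding `inc_one` or `inc_zero` after each observation gives the same value as `bernoulli_statistic`, up to rounding, so only the running total needs to be stored during inspection.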

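As a second example, consider the problem of testing a hypothesis about the mean of a normal distribution. Let x be a normally distributed random variable with unknown mean $\theta$ and unit variance. Let $H_0$ be the hypothesis that $\theta = \theta_0$ and $H_1$ the hypothesis that $\theta = \theta_1$. Then

$$f(x, \theta_0) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x - \theta_0)^2}$$

and

$$f(x, \theta_1) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x - \theta_1)^2}$$

Hence,

$$z_i = \log \frac{f(x_i, \theta_1)}{f(x_i, \theta_0)} = (\theta_1 - \theta_0) x_i + \frac{1}{2}(\theta_0^2 - \theta_1^2)$$

and

$$\log \frac{p_{1m}}{p_{0m}} = z_1 + \cdots + z_m = (\theta_1 - \theta_0) \sum_{i=1}^{m} x_i + \frac{m}{2}(\theta_0^2 - \theta_1^2)$$

If

$$(\theta_1 - \theta_0) \sum_{i=1}^{m} x_i + \frac{m}{2}(\theta_0^2 - \theta_1^2) \ge \log A$$

the process is terminated with the rejection of $H_0$. If

$$(\theta_1 - \theta_0) \sum_{i=1}^{m} x_i + \frac{m}{2}(\theta_0^2 - \theta_1^2) \le \log B$$

the process is terminated with the acceptance of $H_0$. If

$$\log B < (\theta_1 - \theta_0) \sum_{i=1}^{m} x_i + \frac{m}{2}(\theta_0^2 - \theta_1^2) < \log A$$

the experiment is continued by taking an additional observation. Again, $\log (p_{1m}/p_{0m})$ can be computed cumulatively: after each observation $x_i$ we compute $(\theta_1 - \theta_0) x_i + \frac{1}{2}(\theta_0^2 - \theta_1^2)$ and add it to the preceding value of $\log (p_{1m}/p_{0m})$.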
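The expression for the single term $z_i$ in the normal case follows by expanding the two squares in the exponents. The intermediate steps, which the text omits, may be written out as follows; the numerical choice $\theta_0 = 0$, $\theta_1 = 1$ in the last line is an illustrative assumption only.

```latex
\begin{align*}
z_i &= \log \frac{f(x_i, \theta_1)}{f(x_i, \theta_0)}
     = -\tfrac{1}{2}(x_i - \theta_1)^2 + \tfrac{1}{2}(x_i - \theta_0)^2 \\
    &= \tfrac{1}{2}\left[(2 x_i \theta_1 - \theta_1^2) - (2 x_i \theta_0 - \theta_0^2)\right] \\
    &= (\theta_1 - \theta_0)\, x_i + \tfrac{1}{2}(\theta_0^2 - \theta_1^2). \\[4pt]
\text{For } \theta_0 = 0,\ \theta_1 = 1: \quad z_i &= x_i - \tfrac{1}{2}.
\end{align*}
```

In this special case each observation above 1/2 moves the cumulative sum toward $\log A$ and each observation below 1/2 moves it toward $\log B$.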

3.2 Fundamental Relations among the Quantities $\alpha$, $\beta$, A, and B

In this section we shall derive certain inequalities satisfied by the quantities $\alpha$, $\beta$, A, and B which will provide the basis for determining the constants A and B in the sequential probability ratio test.

We shall say a sample $(x_1, \dots, x_n)$ is of type 0 if

$$B < \frac{p_{1m}}{p_{0m}} = \frac{f(x_1, \theta_1) \cdots f(x_m, \theta_1)}{f(x_1, \theta_0) \cdots f(x_m, \theta_0)} < A \quad \text{for } m = 1, \dots, n - 1$$

and

$$\frac{p_{1n}}{p_{0n}} \le B$$

Similarly, we shall say a sample $(x_1, \dots, x_n)$ is of type 1 if

$$B < \frac{p_{1m}}{p_{0m}} = \frac{f(x_1, \theta_1) \cdots f(x_m, \theta_1)}{f(x_1, \theta_0) \cdots f(x_m, \theta_0)} < A \quad \text{for } m = 1, \dots, n - 1$$

and

$$\frac{p_{1n}}{p_{0n}} \ge A$$

Thus, a sample of type 0 leads to the acceptance of $H_0$ and a sample of type 1 leads to the acceptance of $H_1$ (rejection of $H_0$).

Clearly, for any given sample $(x_1, \dots, x_n)$ of type 1 the probability of obtaining such a sample is at least A times as large under hypothesis $H_1$ as under hypothesis $H_0$. Thus, the probability measure of the totality of all samples of type 1 is also at least A times as large under $H_1$ as under $H_0$. The probability measure of the totality of all samples of type 1 is the same as the probability that the sequential process will terminate with the acceptance of $H_1$ (rejection of $H_0$). But the latter probability is equal to $\alpha$ when $H_0$ is true and to $1 - \beta$ when $H_1$ is true.³ Thus, we obtain the inequality

(3:8)  $$1 - \beta \ge A\alpha$$

This inequality can be written as

(3:9)  $$A \le \frac{1 - \beta}{\alpha}$$

Thus, $(1 - \beta)/\alpha$ is an upper limit for A.

A lower limit for B can be derived in a similar way. In fact, for any given sample $(x_1, \dots, x_n)$ of type 0 the probability of obtaining such a sample under $H_1$ is at most B times as large as the probability of obtaining such a sample when $H_0$ is true. Thus, also the probability of accepting $H_0$ is at most B times as large when $H_1$ is true as when $H_0$ is true. Since the probability of accepting $H_0$ is $1 - \alpha$ when $H_0$ is true and $\beta$ when $H_1$ is true, we obtain the inequality

(3:10)  $$\beta \le (1 - \alpha) B$$

This inequality can be written as

(3:11)  $$B \ge \frac{\beta}{1 - \alpha}$$

Thus, $\beta/(1 - \alpha)$ is a lower limit for B.


³ The probability that $H_0$ will be accepted when $H_1$ is true is by definition equal to $\beta$. Section A.1 of the Appendix shows that the probability is one that the sequential process will eventually terminate. Thus, the probability that $H_0$ will be rejected when $H_1$ is true must be equal to $1 - \beta$.


Inequalities (3:8) and (3:10) can also be written as

(3:12)  $$\frac{\alpha}{1 - \beta} \le \frac{1}{A}$$

and

(3:13)  $$\frac{\beta}{1 - \alpha} \le B$$

These inequalities are of considerable value in practical applications, since they furnish upper limits for $\alpha$ and $\beta$ for given values of A and B. For example, it follows from these inequalities that

(3:14)  $$\alpha \le \frac{1}{A}$$

and

(3:15)  $$\beta \le B$$
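For the two-valued variable of Section 3.1 the inequalities (3:12) to (3:15) can be checked numerically. The sketch below is an illustration, not a procedure from the text: it follows the probability mass of all samples that have not yet led to a decision, trial by trial, and accumulates the mass that crosses each boundary. The function name `exact_error_probabilities` and the truncation point `max_m` are assumptions of this sketch; any mass still undecided after `max_m` trials is ignored, which can only lower the computed $\alpha$ and $\beta$.

```python
import math


def exact_error_probabilities(p0, p1, A, B, max_m=2000):
    """Return (alpha, beta) for the two-valued test, truncated at max_m trials."""
    inc_one = math.log(p1 / p0)
    inc_zero = math.log((1.0 - p1) / (1.0 - p0))
    log_a, log_b = math.log(A), math.log(B)

    def terminal_mass(p):
        # cont[k] = probability of still sampling with k ones observed so far
        cont = {0: 1.0}
        accept = reject = 0.0
        for m in range(1, max_m + 1):
            nxt = {}
            for k, mass in cont.items():
                for ones, prob in ((k + 1, p), (k, 1.0 - p)):
                    s = ones * inc_one + (m - ones) * inc_zero  # value of (3:7)
                    if s >= log_a:
                        reject += mass * prob
                    elif s <= log_b:
                        accept += mass * prob
                    else:
                        nxt[ones] = nxt.get(ones, 0.0) + mass * prob
            cont = nxt
            if not cont:
                break
        return accept, reject

    _, alpha = terminal_mass(p0)  # rejecting H0 when H0 is true
    beta, _ = terminal_mass(p1)   # accepting H0 when H1 is true
    return alpha, beta
```

For instance, with the illustrative values $p_0 = 0.1$, $p_1 = 0.3$, $A = 19$, and $B = 1/19$, the returned pair can be substituted into (3:12) and (3:13) to see how much slack the excess over the boundaries leaves.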

It may be of interest to represent graphically the totality of all pairs $(\alpha, \beta)$ which satisfy the inequalities (3:12) and (3:13). Any pair $(\alpha, \beta)$ can be represented by a point in the plane with abscissa $\alpha$ and ordinate $\beta$. Consider the straight lines $L_1$ and $L_2$ in the plane given by the equations

(3:16)  $$\alpha A = 1 - \beta$$

and

(3:17)  $$\beta = B(1 - \alpha)$$

respectively. The line $L_1$ intersects the abscissa axis at $\alpha = 1/A$ and the ordinate axis at $\beta = 1$. Similarly, the line $L_2$ intersects the abscissa axis at $\alpha = 1$ and the ordinate axis at $\beta = B$. The region consisting of all points $(\alpha, \beta)$ which satisfy the inequalities (3:12) and (3:13) is the interior and the boundary of the quadrilateral determined by the lines $L_1$, $L_2$, and the coordinate axes. This region is shown by the shaded area in Fig. 9.

[Fig. 9]

The inequalities (3:12) and (3:13) have been derived under the assumption that the successive observations $x_1, x_2, \dots$, etc., are independent observations on x. The assumption of the independence of the observations has been used in showing that the probability is one that the sequential process will eventually terminate.⁴ The rest of the derivation, however, remains valid also when the successive observations are dependent, i.e., when the conditional distribution of the ith observation $x_i$ is affected by the outcome of the preceding observations $x_1, \dots, x_{i-1}$. If the successive observations are not independent, the probability that a sample $(x_1, \dots, x_m)$ will be obtained, i.e., the joint distribution of $(x_1, \dots, x_m)$, is no longer given by the product $f(x_1, \theta) f(x_2, \theta) \cdots f(x_m, \theta)$, but by a more general function $p_m(x_1, \dots, x_m)$. Thus, in dealing with dependent observations, the null hypothesis $H_0$ will be the statement that the distribution of the sample $(x_1, \dots, x_m)$ is given by some function $p_{0m}(x_1, \dots, x_m)$, and the alternative hypothesis $H_1$ will be the statement that this distribution is given by some other function $p_{1m}(x_1, \dots, x_m)$. We can construct the sequential probability ratio test for testing $H_0$ against $H_1$ in the same way as for independent observations. That is to say, we select two constants A and B ($B < A$) and continue taking observations as long as $B < p_{1m}/p_{0m} < A$. The first time that the probability ratio $p_{1m}/p_{0m}$ is $\ge A$ or $\le B$, we terminate the sequential process. $H_0$ is accepted if $p_{1m}/p_{0m} \le B$ and $H_1$ is accepted if $p_{1m}/p_{0m} \ge A$. The fundamental inequalities (3:12) and (3:13) remain valid for such a test procedure in spite of the dependence of the successive observations, provided that the probability is one that the procedure will eventually terminate. It can be shown that for a very general class of joint distributions $p_{0m}(x_1, \dots, x_m)$ and $p_{1m}(x_1, \dots, x_m)$ the probability is one that the procedure will eventually terminate. Thus, the validity of the inequalities (3:12) and (3:13) is by no means restricted to the case of independent observations. They are generally valid also for dependent observations.


⁴ See Section A.1 in the Appendix.

A simple case of dependent observations arises when we sample from a finite population. Suppose, for example, that a lot consisting of N units of a manufactured product is submitted for acceptance inspection. Let D be the number of defectives in the lot, which is assumed to be unknown. To each defective unit we assign the value 1 and to each non-defective unit the value 0. Then the distribution of a single observation x is given by $f(x, p)$ where $f(1, p) = p$, $f(0, p) = 1 - p$, and $p = D/N$. The successive observations are, however, not independent. For example, if $x_1 = 1$, the distribution of $x_2$ is given by $f\left(x_2, \frac{D - 1}{N - 1}\right)$; if $x_1 = 0$, the distribution of $x_2$ is given by $f\left(x_2, \frac{D}{N - 1}\right)$. If we denote by $d_i$ the number of defectives (the number of ones) in the set of the first i observations $x_1, \dots, x_i$, the joint distribution of $(x_1, \dots, x_m)$ is given by⁵

(3:18)  $$p_m = f\left(x_1, \frac{D}{N}\right) f\left(x_2, \frac{D - d_1}{N - 1}\right) \cdots f\left(x_m, \frac{D - d_{m-1}}{N - m + 1}\right)$$

Suppose that the hypothesis $H_0$ is that D is equal to some specified value $D_0$, and $H_1$ is the hypothesis that D is equal to some value $D_1$ ($D_1 > D_0$). Then the distribution of $(x_1, \dots, x_m)$ under $H_0$ is given by

(3:19)  $$p_{0m} = f\left(x_1, \frac{D_0}{N}\right) f\left(x_2, \frac{D_0 - d_1}{N - 1}\right) \cdots f\left(x_m, \frac{D_0 - d_{m-1}}{N - m + 1}\right)$$

and the distribution under $H_1$ by

(3:20)  $$p_{1m} = f\left(x_1, \frac{D_1}{N}\right) f\left(x_2, \frac{D_1 - d_1}{N - 1}\right) \cdots f\left(x_m, \frac{D_1 - d_{m-1}}{N - m + 1}\right)$$

The sequential probability ratio test for testing $H_0$ against $H_1$ is based on the ratio $p_{1m}/p_{0m}$. Inspection continues as long as $B < p_{1m}/p_{0m} < A$. The lot is accepted if $p_{1m}/p_{0m} \le B$ and the lot is rejected if $p_{1m}/p_{0m} \ge A$. The fundamental inequalities (3:12) and (3:13) remain valid for this test procedure in spite of the dependence of the observations.
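The ratio of (3:20) to (3:19) can be accumulated unit by unit. The following Python sketch is an illustration under stated assumptions: the function name `lot_log_ratio` is hypothetical, the inspection results are coded as the integers 0 and 1, and a sample that is impossible under only one hypothesis is reported as an infinite log ratio in the spirit of the convention in footnote 2.

```python
import math


def lot_log_ratio(observations, N, D0, D1):
    """Return log(p_1m / p_0m) from (3:19) and (3:20) for 0/1 inspection results."""
    log_ratio = 0.0
    d = 0  # defectives found among the units inspected so far
    for i, x in enumerate(observations):
        remaining = N - i  # units still in the lot before this draw
        if remaining <= 0:
            raise ValueError("more observations than units in the lot")
        # Counts of remaining units that would produce the observed value x;
        # the common denominator `remaining` cancels in the ratio.
        n0 = (D0 - d) if x == 1 else remaining - (D0 - d)
        n1 = (D1 - d) if x == 1 else remaining - (D1 - d)
        if n0 <= 0 and n1 <= 0:
            raise ValueError("sample impossible under both hypotheses")
        if n0 <= 0:
            return math.inf   # impossible under H0 only: reject H0
        if n1 <= 0:
            return -math.inf  # impossible under H1 only: accept H0
        log_ratio += math.log(n1 / n0)
        d += x
    return log_ratio
```

The returned value is compared with $\log B$ and $\log A$ exactly as in (3:6); for example, with the illustrative values $N = 10$, $D_0 = 1$, $D_1 = 3$, a first unit found defective gives $\log 3$.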


⁵ This formula is valid as long as $d_{m-1} \le D$.

3.3 Determination of the Constants A and B in Practice

Suppose that we wish to have a test procedure of strength $(\alpha, \beta)$. Then our problem is to determine the constants A and B such that the resulting test will have the desired strength $(\alpha, \beta)$. Let us denote by $A(\alpha, \beta)$ and $B(\alpha, \beta)$ the values of A and B, respectively, for which the test has the required strength $(\alpha, \beta)$. The exact determination of the values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ is usually very laborious.⁶ However, the fundamental inequalities derived in the preceding section permit an approximate determination of A and B which will suffice for most practical purposes. From (3:9) and (3:11) it follows that


(3:21)  $$A(\alpha, \beta) \le \frac{1 - \beta}{\alpha}$$

and

(3:22)  $$B(\alpha, \beta) \ge \frac{\beta}{1 - \alpha}$$

We shall propose to put $A = (1 - \beta)/\alpha = a(\alpha, \beta)$, say, and $B = \beta/(1 - \alpha) = b(\alpha, \beta)$, say, and we shall investigate the consequences of this determination of A and B. From (3:21) and (3:22) it follows that the value $a(\alpha, \beta)$ chosen for A is greater than or equal to the exact value $A(\alpha, \beta)$, and the value $b(\alpha, \beta)$ chosen for B is less than or equal to the exact value $B(\alpha, \beta)$. Then, letting $A = a(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and $B = b(\alpha, \beta)$ instead of $B(\alpha, \beta)$ will, in general, change the probabilities of errors of the first and second kinds. If A were put equal to a value greater than $A(\alpha, \beta)$, and if B were put equal to $B(\alpha, \beta)$, then the resulting probability of an error of the first kind would be less than $\alpha$, but the probability of an error of the second kind would be slightly larger than $\beta$. Similarly, if we were to use the exact value $A(\alpha, \beta)$ for A, but a value B below the exact value $B(\alpha, \beta)$, the resulting probability of an error of the second kind would be less than $\beta$, and the probability of an error of the first kind would be slightly greater than $\alpha$. Thus, if a value A is used which is higher than the exact value $A(\alpha, \beta)$ and a value B is used which is lower than the exact value $B(\alpha, \beta)$, it is not clear what the resulting effect on the probabilities of errors of the first and second kinds will be. Let us denote by $\alpha'$ and $\beta'$ the resulting probabilities of errors of the first and second kinds, respectively, if we put $A = a(\alpha, \beta)$ and $B = b(\alpha, \beta)$. From (3:12) and (3:13) it follows that

(3:23)  $$\frac{\alpha'}{1 - \beta'} \le \frac{1}{a(\alpha, \beta)} = \frac{\alpha}{1 - \beta}$$

and

(3:24)  $$\frac{\beta'}{1 - \alpha'} \le b(\alpha, \beta) = \frac{\beta}{1 - \alpha}$$


⁶ The results in Section A.4 of the Appendix can be used for deriving arbitrarily close approximations to the values $A(\alpha, \beta)$ and $B(\alpha, \beta)$.
From these inequalities it follows that

(3:25)  $$\alpha' \le \frac{\alpha}{1 - \beta}$$

and

(3:26)  $$\beta' \le \frac{\beta}{1 - \alpha}$$

Multiplying (3:23) by $(1 - \beta)(1 - \beta')$ and (3:24) by $(1 - \alpha)(1 - \alpha')$ and adding the two resulting inequalities, we obtain

(3:27)  $$\alpha' + \beta' \le \alpha + \beta$$

Inequalities (3:25), (3:26), and (3:27) give valuable upper limits for $\alpha'$ and $\beta'$. The values $\alpha$ and $\beta$ will usually be small in practical applications. Most frequently they will lie in the range from .01 to .05. Thus $\alpha/(1 - \beta)$ and $\beta/(1 - \alpha)$ will be very nearly equal to $\alpha$ and $\beta$, respectively. It follows then from (3:25) and (3:26) that the amount by which $\alpha'$ may exceed $\alpha$, or $\beta'$ may exceed $\beta$, is very small and can be neglected for all practical purposes. Moreover, (3:27) shows that at least one of the inequalities $\alpha' \le \alpha$ and $\beta' \le \beta$ must hold. In other words, by using $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and $B(\alpha, \beta)$, respectively, at most one of the probabilities $\alpha$ and $\beta$ may be increased.
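The step from (3:23) and (3:24) to (3:27) is short but easy to mis-add, so it may help to see it written out. The block below only expands the multiplication described in the text; no new assumption is involved beyond $0 < \alpha, \beta, \alpha', \beta' < 1$.

```latex
\begin{align*}
\text{(3:23)} \times (1-\beta)(1-\beta'): &\quad \alpha'(1-\beta) \le \alpha(1-\beta'), \\
\text{(3:24)} \times (1-\alpha)(1-\alpha'): &\quad \beta'(1-\alpha) \le \beta(1-\alpha'). \\[4pt]
\text{Adding:} &\quad \alpha' - \alpha'\beta + \beta' - \alpha\beta'
   \le \alpha - \alpha\beta' + \beta - \alpha'\beta, \\
\text{cancelling } -\alpha'\beta \text{ and } -\alpha\beta': &\quad \alpha' + \beta' \le \alpha + \beta.
\end{align*}
```

If both $\alpha' > \alpha$ and $\beta' > \beta$ held, the sum on the left would exceed the sum on the right, which is why at most one of the two error probabilities can be increased.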


Thus, we may conclude: The use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and $B(\alpha, \beta)$, respectively, cannot result in any appreciable increase in the value of either $\alpha$ or $\beta$. In other words, for all practical purposes the test corresponding to $A = a(\alpha, \beta)$ and $B = b(\alpha, \beta)$ provides at least the same protection against wrong decisions as the test corresponding to $A = A(\alpha, \beta)$ and $B = B(\alpha, \beta)$.

Our discussion so far leaves still open the possibility that the use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and $B(\alpha, \beta)$, respectively, may result in an appreciable decrease of $\alpha$, or $\beta$, or both. If this were the case, it would mean only that the test corresponding to $A = a(\alpha, \beta)$ and $B = b(\alpha, \beta)$ would provide a better protection against wrong decisions than the test corresponding to $A = A(\alpha, \beta)$ and $B = B(\alpha, \beta)$. Thus the only disadvantage that may arise from using $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and $B(\alpha, \beta)$, respectively, is that it may result in an appreciable increase in the number of observations required by the test. In fact, since $a(\alpha, \beta) \ge A(\alpha, \beta)$ and $b(\alpha, \beta) \le B(\alpha, \beta)$, the number of observations required by the test corresponding to $A = a(\alpha, \beta)$ and $B = b(\alpha, \beta)$ can never be smaller than the number of observations required by the test corresponding to $A = A(\alpha, \beta)$ and $B = B(\alpha, \beta)$. Thus, if the increase in the necessary number of observations caused by the use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and $B(\alpha, \beta)$ can be shown to be only slight and of no practical consequence, the test corresponding to $A = a(\alpha, \beta)$ and $B = b(\alpha, \beta)$ serves the purpose just as well, and the determination of the exact values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ is of little interest.

We shall now indicate the reasons why the increase in the necessary number of observations caused by the use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of the exact values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ will generally be only slight.⁷ The reason that (3:21) and (3:22) are inequalities instead of equalities is that the sequential process may terminate with $p_{1m}/p_{0m} > A$ or $p_{1m}/p_{0m} < B$. If at the final stage $p_{1m}/p_{0m}$ were exactly equal to A or B, then $A(\alpha, \beta)$ and $B(\alpha, \beta)$ would be exactly equal to $(1 - \beta)/\alpha$ and $\beta/(1 - \alpha)$, respectively. On the other hand, a possible excess of $p_{1m}/p_{0m}$ over the boundaries A and B at the termination of the test procedure is caused only by the discontinuity of the number of observations, i.e., by the fact that the number of observations can take only integral values. Thus, if fractional observations were possible, i.e., if the number m of observations were a continuous variable, $p_{1m}/p_{0m}$ would also be a continuous function of m and consequently $A(\alpha, \beta)$ and $B(\alpha, \beta)$ would be exactly equal to $a(\alpha, \beta)$ and $b(\alpha, \beta)$, respectively.

That the increase in the necessary number of trials caused by the use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$ will generally be slight is strongly indicated by the fact that the discrepancy between $A(\alpha, \beta)$ and $a(\alpha, \beta)$, as well as that between $B(\alpha, \beta)$ and $b(\alpha, \beta)$, arises only from the discontinuity of the number of observations. In Section 3.9 we give upper estimates of the increase in the expected number of trials caused by the use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$. Numerical computations given in that section show that the increase is slight. It may be added that the nearer the distribution $f(x, \theta_1)$ is to the distribution $f(x, \theta_0)$, the smaller will be this increase in the expected number of trials. The reason for this is that the nearer $f(x, \theta_1)$ is to $f(x, \theta_0)$, the smaller the expected excess of $p_{1m}/p_{0m}$ over the boundaries A and B and, therefore, also the smaller the discrepancy between $A(\alpha, \beta)$ and $a(\alpha, \beta)$ as well as that between $B(\alpha, \beta)$ and $b(\alpha, \beta)$. If $f(x, \theta_1)$ approaches $f(x, \theta_0)$, the exact values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ converge to $a(\alpha, \beta)$ and $b(\alpha, \beta)$, respectively.

If experimentation is not excessively costly, for all practical purposes the following procedure may be adopted: If a sequential test is desired such that the probability of an error of the first kind does not exceed $\alpha$ and the probability of an error of the second kind does not exceed $\beta$, put $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$ and carry out the sequential probability ratio test as defined by the inequalities (3:1), (3:2), and (3:3).

⁷ For a more complete discussion see Section 3.9.

The fact that for practical purposes we may put $A = a(\alpha, \beta)$ and $B = b(\alpha, \beta)$ brings out a surprising feature of the sequential test as compared with current tests. Whereas current tests cannot be carried out without finding the probability distribution of the statistic on which the test is based, there are no distribution problems in carrying out a sequential test. In fact, $a(\alpha, \beta)$ and $b(\alpha, \beta)$ depend on $\alpha$ and $\beta$ only, and the ratio $p_{1m}/p_{0m}$ can be calculated from the data of the problem without solving any distribution problems. Distribution problems arise in connection with the sequential process only if it is desired to find the probability distribution of the number of trials necessary for reaching a final decision. But this is of secondary importance as long as we know that the sequential test on the average leads to a saving in the number of observations.
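The practical determination of the constants amounts to two divisions. The sketch below, with the hypothetical names `wald_thresholds`, `log_thresholds`, and `error_bounds`, returns $a(\alpha, \beta)$ and $b(\alpha, \beta)$ together with the upper limits (3:25) and (3:26) for the resulting error probabilities. The requirement $\alpha + \beta < 1$ is an assumption of this sketch; it guarantees $b < 1 < a$.

```python
import math


def wald_thresholds(alpha: float, beta: float):
    """Return (a, b) with a = (1 - beta)/alpha and b = beta/(1 - alpha)."""
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0 and alpha + beta < 1.0):
        raise ValueError("need 0 < alpha, 0 < beta and alpha + beta < 1")
    return (1.0 - beta) / alpha, beta / (1.0 - alpha)


def log_thresholds(alpha: float, beta: float):
    """Return (log a, log b), the boundaries for the cumulative sum in (3:6)."""
    a, b = wald_thresholds(alpha, beta)
    return math.log(a), math.log(b)


def error_bounds(alpha: float, beta: float):
    """Upper limits (3:25) and (3:26) for the resulting alpha' and beta'."""
    return alpha / (1.0 - beta), beta / (1.0 - alpha)
```

For the illustrative choice $\alpha = \beta = .05$ this gives $a = 19$ and $b = 1/19$, and the limits for $\alpha'$ and $\beta'$ are both $.05/.95$, that is, about .0526.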


3.4 The OC Function of the Sequential Probability Ratio Test ⁸

Since the sequential probability ratio test for testing the hypothesis $H_0$ against the hypothesis $H_1$ will be applied to problems when the parameter $\theta$ can take values $\neq \theta_0$ and $\neq \theta_1$, it is of interest to derive the whole operating characteristic function $L(\theta)$ of the test. For convenience, we shall treat the case of a single unknown parameter $\theta$ in this section and in Section 3.5. The results can be extended without difficulty to any number of parameters. In Section 2.2.1, $L(\theta)$ has been defined as the probability that the sequential process will terminate with the acceptance of $H_0$ when $\theta$ is the true value of the parameter. In this section we shall indicate the derivation of an approximation formula for $L(\theta)$, neglecting the excess of $p_{1m}/p_{0m}$ over the boundaries A and B at the termination of the process. A rigorous derivation (using a different method) together with upper and lower limits for the OC function will be given in Section A.2.3 of the Appendix.

⁸ As mentioned in the Introduction, the operating characteristic function for the special case of a binomial distribution was found by Milton Friedman and George W. Brown independently of each other, and slightly earlier by C. M. Stockman in England. The derivation of the OC function in the general case is due to the author.

Consider the expression

(3:28)  $$\left[\frac{f(x, \theta_1)}{f(x, \theta_0)}\right]^{h(\theta)}$$

For each value $\theta$, the value of $h(\theta)$ is determined so that $h(\theta) \neq 0$ and the expected value of the expression (3:28) is equal to 1, i.e.,

(3:29a)  $$\int_{-\infty}^{+\infty} \left[\frac{f(x, \theta_1)}{f(x, \theta_0)}\right]^{h(\theta)} f(x, \theta)\, dx = 1$$

if $f(x, \theta)$ is the probability density function, or

(3:29b)  $$\sum_{x} \left[\frac{f(x, \theta_1)}{f(x, \theta_0)}\right]^{h(\theta)} f(x, \theta) = 1$$

if x has a discrete distribution (the summation is taken over all possible values of x). It is shown in Section A.2.1 of the Appendix that under some slight restriction on the nature of the distribution function $f(x, \theta)$, there exists exactly one value $h(\theta) \neq 0$ such that (3:29) is fulfilled. Hence, for any given value $\theta$, the function of x given by

(3:30)  $$f^*(x, \theta) = \left[\frac{f(x, \theta_1)}{f(x, \theta_0)}\right]^{h(\theta)} f(x, \theta)$$

is a distribution function.

Since $h(\theta) \neq 0$, there are two possibilities: $h(\theta) > 0$ or $h(\theta) < 0$. We shall first consider the case when $h(\theta) > 0$.

Let H denote the hypothesis that $f(x, \theta)$ is the true distribution of x and $H^*$ the hypothesis that $f^*(x, \theta)$ is the true distribution of x. Consider the sequential probability ratio test $S^*$ for testing H against $H^*$ defined as follows: Continue taking observations as long as

(3:31)  $$B^{h(\theta)} < \frac{f^*(x_1, \theta) \cdots f^*(x_m, \theta)}{f(x_1, \theta) \cdots f(x_m, \theta)} < A^{h(\theta)}$$

Accept the hypothesis H if

(3:32)  $$\frac{f^*(x_1, \theta) \cdots f^*(x_m, \theta)}{f(x_1, \theta) \cdots f(x_m, \theta)} \le B^{h(\theta)}$$

Reject the hypothesis H (accept $H^*$) if

(3:33)  $$\frac{f^*(x_1, \theta) \cdots f^*(x_m, \theta)}{f(x_1, \theta) \cdots f(x_m, \theta)} \ge A^{h(\theta)}$$

Since

(3:34)  $$\frac{f^*(x, \theta)}{f(x, \theta)} = \left[\frac{f(x, \theta_1)}{f(x, \theta_0)}\right]^{h(\theta)}$$

and since $h(\theta) > 0$, the inequalities (3:31), (3:32), and (3:33) are equivalent to

(3:35)  $$B < \frac{f(x_1, \theta_1) \cdots f(x_m, \theta_1)}{f(x_1, \theta_0) \cdots f(x_m, \theta_0)} < A$$

(3:36)  $$\frac{f(x_1, \theta_1) \cdots f(x_m, \theta_1)}{f(x_1, \theta_0) \cdots f(x_m, \theta_0)} \le B$$

and

(3:37)  $$\frac{f(x_1, \theta_1) \cdots f(x_m, \theta_1)}{f(x_1, \theta_0) \cdots f(x_m, \theta_0)} \ge A$$

But these inequalities are identical with those defining the sequential probability ratio test S for testing $H_0$ against $H_1$, when the constants A and B are used. Thus, if the test $S^*$ leads to the acceptance of H, the test S leads to the acceptance of $H_0$, and if $S^*$ leads to the rejection of H, then S also leads to the rejection of $H_0$. From this, it follows that the probability of accepting $H_0$ when $\theta$ is true, i.e., the value of $L(\theta)$, is the same as the probability that the test $S^*$ will lead to the acceptance of H when $f(x, \theta)$ is the true distribution of x.

To calculate the latter probability we shall apply the formulas (3:9) and (3:11) to the test procedure $S^*$. Denote by $\alpha'$ the probability that $S^*$ will lead to the rejection of H when H is true, and by $\beta'$ the probability that $S^*$ leads to the acceptance of H when $H^*$ is true. Applying the formulas (3:9) and (3:11) to the test procedure $S^*$ we obtain

(3:38)  $$\frac{1 - \beta'}{\alpha'} \ge A^{h(\theta)}$$

and

(3:39)  $$\frac{\beta'}{1 - \alpha'} \le B^{h(\theta)}$$

When the excess over the boundaries at the termination of the process is neglected, the equality sign holds in (3:38) and (3:39), that is,⁹

(3:40)  $$\frac{1 - \beta'}{\alpha'} \approx A^{h(\theta)}$$

and

(3:41)  $$\frac{\beta'}{1 - \alpha'} \approx B^{h(\theta)}$$

From (3:40) and (3:41) we obtain

(3:42)  $$\alpha' \approx \frac{1 - B^{h(\theta)}}{A^{h(\theta)} - B^{h(\theta)}}$$

Since $\alpha' = 1 - L(\theta)$, we get

(3:43)  $$L(\theta) \approx \frac{A^{h(\theta)} - 1}{A^{h(\theta)} - B^{h(\theta)}}$$

The case $h(\theta) < 0$ can be treated in a similar way. We obtain the same result, i.e., the approximation formula (3:43) remains valid also when $h(\theta) < 0$.

⁹ The symbol $\approx$ indicates an approximate equality.


It is interesting to note that $h(\theta_0) = 1$ and $h(\theta_1) = -1$. This follows easily from (3:29b).
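The two special values of $h(\theta)$ can be verified in one line each, and they recover the prescribed error probabilities from (3:43). The block below is a sketch for the discrete case, assuming that $f(x, \theta_0)$ and $f(x, \theta_1)$ are positive for the same values of x and that $A = (1 - \beta)/\alpha$, $B = \beta/(1 - \alpha)$; the density case is identical with integrals in place of sums.

```latex
\begin{align*}
h = 1,\ \theta = \theta_0: &\quad
  \sum_x \frac{f(x,\theta_1)}{f(x,\theta_0)}\, f(x,\theta_0) = \sum_x f(x,\theta_1) = 1, \\
h = -1,\ \theta = \theta_1: &\quad
  \sum_x \frac{f(x,\theta_0)}{f(x,\theta_1)}\, f(x,\theta_1) = \sum_x f(x,\theta_0) = 1. \\[4pt]
L(\theta_0) &\approx \frac{A - 1}{A - B}
  = \frac{(1-\alpha-\beta)/\alpha}{(1-\alpha-\beta)/[\alpha(1-\alpha)]} = 1 - \alpha, \\
L(\theta_1) &\approx \frac{A^{-1} - 1}{A^{-1} - B^{-1}}
  = \frac{-(1-\alpha-\beta)/(1-\beta)}{-(1-\alpha-\beta)/[\beta(1-\beta)]} = \beta.
\end{align*}
```

Thus the approximation (3:43) passes through the two points $(\theta_0, 1 - \alpha)$ and $(\theta_1, \beta)$ that define the strength of the test.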

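As an illustration, we shall determine $L(\theta)$ for the binomial case when x can take only the values 0 and 1 and the distribution $f(x, \theta)$ is given as follows: $f(1, \theta) = \theta$ and $f(0, \theta) = 1 - \theta$. Then equation (3:29b) can be written as

(3:44)  $$\theta \left(\frac{\theta_1}{\theta_0}\right)^{h(\theta)} + (1 - \theta) \left(\frac{1 - \theta_1}{1 - \theta_0}\right)^{h(\theta)} = 1$$

To plot the OC function, it is not necessary to solve equation (3:44) with respect to $h(\theta)$. We may consider $h = h(\theta)$ a parameter and solve (3:44) with respect to $\theta$. Then we obtain

(3:45)  $$\theta = \frac{1 - \left(\dfrac{1 - \theta_1}{1 - \theta_0}\right)^{h}}{\left(\dfrac{\theta_1}{\theta_0}\right)^{h} - \left(\dfrac{1 - \theta_1}{1 - \theta_0}\right)^{h}}$$

If we let $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$, (3:43) can be written as

(3:46)  $$L(\theta) \approx \frac{\left(\dfrac{1 - \beta}{\alpha}\right)^{h} - 1}{\left(\dfrac{1 - \beta}{\alpha}\right)^{h} - \left(\dfrac{\beta}{1 - \alpha}\right)^{h}}$$

For any arbitrarily chosen value h, the point $[\theta, L(\theta)]$, computed from (3:45) and (3:46), will be a point on the OC function. The OC function can be drawn by plotting a sufficiently large number of points $[\theta, L(\theta)]$ corresponding to various values of h.

A typical OC function for the binomial case is shown in Fig. 10.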
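The parametric plotting procedure can be sketched in Python. The function name `binomial_oc_point` is a hypothetical helper; it evaluates (3:45) and (3:46) for one non-zero value of h and is intended for moderate values of h only, since very large powers overflow in floating-point arithmetic.

```python
import math


def binomial_oc_point(h, theta0, theta1, alpha, beta):
    """Return (theta, L(theta)) from (3:45) and (3:46) for a non-zero h."""
    if h == 0:
        raise ValueError("h must be non-zero")
    r1 = (theta1 / theta0) ** h
    r0 = ((1.0 - theta1) / (1.0 - theta0)) ** h
    theta = (1.0 - r0) / (r1 - r0)               # (3:45)
    a_h = ((1.0 - beta) / alpha) ** h
    b_h = (beta / (1.0 - alpha)) ** h
    oc = (a_h - 1.0) / (a_h - b_h)               # (3:46)
    return theta, oc
```

Evaluating the function on a grid of h values, for example from 3 down to -3 while skipping 0, yields points that can be joined to draw the curve; h = 1 and h = -1 give the points $(\theta_0, 1 - \alpha)$ and $(\theta_1, \beta)$.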





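We shall now compute $L(\theta)$ when x is normally distributed with unknown mean $\theta$ and known variance $\sigma^2$. In this case we have

$$f(x, \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2\sigma^2}(x - \theta)^2}$$

The quantity $h(\theta)$ is the non-zero root of the equation

(3:47)  $$\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2\sigma^2}(x - \theta)^2} \left[e^{\frac{1}{2\sigma^2}\left[2(\theta_1 - \theta_0)x + \theta_0^2 - \theta_1^2\right]}\right]^{h(\theta)} dx = 1$$

Evaluating the above integral and solving the equation with respect to $h(\theta)$, we obtain

(3:48)  $$h(\theta) = \frac{\theta_1 + \theta_0 - 2\theta}{\theta_1 - \theta_0}$$

An approximation to the OC function is obtained from (3:43) by substituting $(\theta_1 + \theta_0 - 2\theta)/(\theta_1 - \theta_0)$ for $h(\theta)$.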
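In the normal case $h(\theta)$ is explicit, so the OC approximation can be evaluated directly for any $\theta$. The sketch below uses the hypothetical names `normal_h`, `oc_approx`, and `normal_oc`. One point is an addition of this sketch and not of the text above: at $h = 0$ formula (3:43) takes the form 0/0, and the code substitutes its limiting value $\log A / (\log A - \log B)$.

```python
import math


def normal_h(theta, theta0, theta1):
    """h(theta) from (3:48); it does not depend on the variance."""
    return (theta1 + theta0 - 2.0 * theta) / (theta1 - theta0)


def oc_approx(h, A, B):
    """Approximation (3:43); the h = 0 value is the limit as h tends to 0."""
    if h == 0.0:
        return math.log(A) / (math.log(A) - math.log(B))
    a_h, b_h = A ** h, B ** h  # may overflow for very large |h|
    return (a_h - 1.0) / (a_h - b_h)


def normal_oc(theta, theta0, theta1, alpha, beta):
    """Approximate L(theta) with A = (1 - beta)/alpha, B = beta/(1 - alpha)."""
    A = (1.0 - beta) / alpha
    B = beta / (1.0 - alpha)
    return oc_approx(normal_h(theta, theta0, theta1), A, B)
```

With the illustrative values $\theta_0 = 0$, $\theta_1 = 1$, and $\alpha = \beta = .05$, the function returns approximately .95 at $\theta = 0$, .05 at $\theta = 1$, and .5 at the midpoint.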


3.5 The ASN Function of a Sequential Probability Ratio Test

Let n denote the number of observations required by the test and let $E_\theta(n)$ be the expected value of n when $\theta$ is the true value of the parameter. This expected value $E_\theta(n)$ is a function of $\theta$ which we have called the average sample number function, or briefly the ASN function. In this section we shall outline the derivation of an approximation formula for the ASN function, neglecting the excess of $p_{1m}/p_{0m}$ over the boundaries A and B at the termination of the sequential process. A more complete discussion together with upper and lower limits for the ASN function is given in Section A.3 of the Appendix.

Let N be an integer sufficiently large to allow the probability that $n \ge N$ to be neglected.¹⁰ Thus we shall assume that $n < N$. Then we can write

(3:49)  $$z_1 + \cdots + z_N = (z_1 + \cdots + z_n) + (z_{n+1} + \cdots + z_N)$$

where

(3:50)  $$z_\alpha = \log \frac{f(x_\alpha, \theta_1)}{f(x_\alpha, \theta_0)}$$

¹⁰ It is shown in Section A.3.1 that no error is involved in assuming this, since we pass to the limit when N approaches $\infty$.

Taking expected values on both sides of (3:49), we obtain

(3:51)  $$N E(z) = E(z_1 + \cdots + z_n) + E(z_{n+1} + \cdots + z_N)$$

where

(3:52)  $$z = \log \frac{f(x, \theta_1)}{f(x, \theta_0)}$$

Since, for $\alpha > n$, the random variable $z_\alpha$ is distributed independently of n, the expected value of $z_{n+1} + \cdots + z_N$ is equal to the expected value of $(N - n)$ times the expected value of a single z, i.e.,

(3:53)  $$E(z_{n+1} + \cdots + z_N) = E(N - n) E(z) = N E(z) - E(n) E(z)$$

From (3:51) and (3:53) it follows that

(3:54)  $$E(z_1 + \cdots + z_n) - E(n) E(z) = 0$$

Hence

(3:55)  $$E(n) = \frac{E(z_1 + \cdots + z_n)}{E(z)}$$

if $E(z) \neq 0$.

If $\theta$ is the true value of the parameter, then $E(n) = E_\theta(n)$ by the definition of the symbol $E_\theta(n)$. We shall denote by $E_\theta(z)$ the expected value $E(z)$ of z when $\theta$ is the true value of the parameter. If the excess of the probability ratio $p_{1m}/p_{0m}$ over the boundaries A and B at the termination of the sequential process is neglected, the random variable $(z_1 + \cdots + z_n)$ can take only the values $\log A$ and $\log B$ with the probabilities $1 - L(\theta)$ and $L(\theta)$, respectively. Hence

(3:56)  $$E(z_1 + \cdots + z_n) \approx L(\theta) \log B + [1 - L(\theta)] \log A$$

From (3:55) and (3:56) we obtain the approximation formula

(3:57)  $$E_\theta(n) \approx \frac{L(\theta) \log B + [1 - L(\theta)] \log A}{E_\theta(z)}$$


I ^ preceding section we have computed explicitly the formula 

( ) or the binomial and normal case. Thus, to obtain the explicit 
ormu a for Bein'), w'e need only compute Eeiz). In the binomial case, 
i-e., when fix, d) = e for x == 1 and fix, d) = I ~ d for x = 0, we have 

fjO, 0i) 
fiO, do^ 


(3:5«) = « log-^ +(!-«) log 


fil.do) 


0 


^ “ — b (1 ~ d) log 

do 


1 - 0 


1 - dn 


64 


THE SEQUENTIAL PROBABILITY RATIO TEST 


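In the normal case, i.e., when

$f(x, \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x - \theta)^2 / (2\sigma^2)}$

we have

(3:59)  $z = \log \frac{f(x, \theta_1)}{f(x, \theta_0)} = \frac{1}{2\sigma^2} [2(\theta_1 - \theta_0)x + \theta_0^2 - \theta_1^2]$

Hence,

(3:60)  $E_\theta(z) = \frac{1}{2\sigma^2} [2(\theta_1 - \theta_0)\theta + \theta_0^2 - \theta_1^2]$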
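Formulas (3:57), (3:58), and (3:60) translate directly into a few lines of code. The sketch below is an illustration only; the function names are hypothetical, and the value of $L(\theta)$ must be supplied by the caller, since its computation belongs to the preceding section.

```python
import math


def ez_binomial(theta, theta0, theta1):
    """E_theta(z) in the binomial case, formula (3:58)."""
    return (theta * math.log(theta1 / theta0)
            + (1.0 - theta) * math.log((1.0 - theta1) / (1.0 - theta0)))


def ez_normal(theta, theta0, theta1, sigma=1.0):
    """E_theta(z) in the normal case, formula (3:60)."""
    return ((2.0 * (theta1 - theta0) * theta + theta0 ** 2 - theta1 ** 2)
            / (2.0 * sigma ** 2))


def asn_approx(l_theta, a, b, ez):
    """Approximate E_theta(n) by formula (3:57).

    l_theta is the value L(theta) of the operating characteristic;
    a and b are the boundaries A and B; ez is E_theta(z).
    """
    if ez == 0.0:
        raise ValueError("formula (3:57) requires E_theta(z) != 0")
    return (l_theta * math.log(b) + (1.0 - l_theta) * math.log(a)) / ez


if __name__ == "__main__":
    # Illustrative values (assumed): alpha = beta = .05, A = 19, B = 1/19,
    # theta0 = 0, theta1 = 0.5, true theta = theta1 so that L(theta) = beta.
    ez = ez_normal(0.5, 0.0, 0.5)
    print(asn_approx(0.05, 19.0, 1.0 / 19.0, ez))
```

With these assumed values $E_\theta(z) = 0.125$ and the approximation gives an expected sample size of roughly 21 observations.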


3.6 Saving in the Number of Observations Effected by the Use of 
the Sequential Probability Ratio Test instead of the Current 
Test Procedure 

In this section we shall assume that $H_0$ is the hypothesis that the
random variable $x$ under consideration is normally distributed with
mean $\theta_0$ and variance unity, while $H_1$ is the hypothesis that $x$ is
normally distributed with mean $\theta_1$ and variance unity. We may assume
without loss of generality that $\theta_0 < \theta_1$. We shall compare the
expected number of observations required by the sequential probability
ratio test of strength $(\alpha, \beta)$ for testing $H_0$ against $H_1$ with the fixed
number of observations needed for the current most powerful test to
attain the same strength $(\alpha, \beta)$.

We shall denote by $n(\alpha, \beta)$ the fixed number of observations
required by the current test to attain the strength $(\alpha, \beta)$. The current
most powerful test procedure for testing $H_0$ against $H_1$ is carried out
as follows. The hypothesis $H_0$ is accepted if the arithmetic mean $\bar{x}$ of
the observations $x_1, \cdots, x_n$ (the number $n$ of observations is
determined in advance) is less than or equal to a preassigned constant $d$,
and $H_0$ is rejected ($H_1$ is accepted) if $\bar{x}$ exceeds $d$. The constant $d$
and the fixed number $n$ of observations are to be determined so that
the test will have the required strength $(\alpha, \beta)$. For any given $n$ and $d$
the corresponding strength of the test can be determined as follows.
Since $\bar{x} \le d$ is equivalent to the inequality $\sqrt{n}(\bar{x} - \theta_0) \le \sqrt{n}(d - \theta_0)$,
the probability that $\bar{x} \le d$ is the same as the probability that
$\sqrt{n}(\bar{x} - \theta_0) \le \sqrt{n}(d - \theta_0)$. The random variable $y = \sqrt{n}(\bar{x} - \theta_0)$
is normally distributed with mean 0 and variance unity if $H_0$ is true.
Thus, the probability that $\bar{x} \le d$ when $H_0$ is true, i.e., the probability
that we shall accept $H_0$ when $H_0$ is true, is equal to the probability
that $y \le \sqrt{n}(d - \theta_0)$. We shall denote by $G(t)$ the probability that
a normally distributed random variable with mean 0 and variance
unity will take a value less than $t$, i.e.,

(3:61)  $G(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-x^2/2} \, dx$

Then the probability that we shall accept $H_0$ when $H_0$ is true is equal
to $G[\sqrt{n}(d - \theta_0)]$. Since the probability that we shall accept $H_0$ when
$H_0$ is true is $1 - \alpha$ by definition, we have

(3:62)  $G[\sqrt{n}(d - \theta_0)] = 1 - \alpha$


To determine the value of $\beta$ corresponding to given $n$ and $d$, we shall
write the inequality $\bar{x} \le d$ in the equivalent form $\sqrt{n}(\bar{x} - \theta_1) \le
\sqrt{n}(d - \theta_1)$. By definition, $\beta$ is the probability that we shall accept
$H_0$ when $H_1$ is true. But the latter probability is the same as the
probability that $\bar{x} \le d$, i.e., that $\sqrt{n}(\bar{x} - \theta_1) \le \sqrt{n}(d - \theta_1)$, when
$H_1$ is true. But when $H_1$ is true this probability is equal to $G[\sqrt{n}(d - \theta_1)]$.
Thus, we have

(3:63)  $G[\sqrt{n}(d - \theta_1)] = \beta$

Hence, to obtain a test of the required strength $(\alpha, \beta)$, we have to
choose the quantities $n$ and $d$ so that equations (3:62) and (3:63) are
fulfilled. Let $\lambda_0$ be the value for which $G(\lambda_0) = 1 - \alpha$ and let $\lambda_1$ be
the value for which $G(\lambda_1) = \beta$. The values $\lambda_0$ and $\lambda_1$ can be obtained
from a table of the normal distribution. Then equations (3:62) and
(3:63) can be written as

(3:64)  $\sqrt{n}(d - \theta_0) = \lambda_0$

and

(3:65)  $\sqrt{n}(d - \theta_1) = \lambda_1$

Subtracting equation (3:64) from equation (3:65) we obtain

(3:66)  $\sqrt{n}(\theta_0 - \theta_1) = \lambda_1 - \lambda_0$

Thus,

(3:67)  $n = n(\alpha, \beta) = \frac{(\lambda_1 - \lambda_0)^2}{(\theta_0 - \theta_1)^2}$

If this expression is not an integer, $n(\alpha, \beta)$ is the smallest integer in
excess.

We shall now compare the expected number of observations required
by the sequential probability ratio test of strength $(\alpha, \beta)$ with
the fixed number $n(\alpha, \beta)$ of observations required
by the current test as given in formula (3:67). In the sequential
test we shall use the approximation formulas for $A$ and $B$, i.e.,
we shall let $A$ and $B$ equal $(1 - \beta)/\alpha$ and $\beta/(1 - \alpha)$, respectively,
instead of the exact values $A(\alpha, \beta)$ and $B(\alpha, \beta)$, respectively. It has
been shown in Section 3.2 that $(1 - \beta)/\alpha \ge A(\alpha, \beta)$ and $\beta/(1 - \alpha)
\le B(\alpha, \beta)$. Thus, by letting $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$
instead of using the exact values $A(\alpha, \beta)$ and $B(\alpha, \beta)$, we can only
increase the number of observations required by the sequential test.
Consequently, the saving effected by the sequential test of strength
$(\alpha, \beta)$ as compared with the current test cannot be smaller than the
saving which results from the sequential test obtained by using the
approximation formulas $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$.
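The fixed sample size of the current test, formula (3:67), together with the constant $d$ from (3:64), can be computed as in the following sketch. It is offered as an illustration only: the function name is hypothetical, and the inverse of $G$ is taken from Python's standard `statistics.NormalDist` in place of a printed table of the normal distribution.

```python
import math
from statistics import NormalDist


def fixed_sample_size(alpha, beta, theta0, theta1):
    """Return (n, d) for the current most powerful test of strength (alpha, beta).

    n is formula (3:67) rounded up to the next integer; d is obtained from
    (3:64) using the unrounded value of n, so that (3:65) holds as well.
    """
    g_inv = NormalDist().inv_cdf
    lam0 = g_inv(1.0 - alpha)          # G(lam0) = 1 - alpha
    lam1 = g_inv(beta)                 # G(lam1) = beta
    n_exact = (lam1 - lam0) ** 2 / (theta0 - theta1) ** 2
    d = theta0 + lam0 / math.sqrt(n_exact)
    return math.ceil(n_exact), d


if __name__ == "__main__":
    # Illustrative values (assumed): alpha = beta = .05, theta0 = 0, theta1 = 0.5
    print(fixed_sample_size(0.05, 0.05, 0.0, 0.5))
```

For these assumed values the unrounded expression is about 43.3, so 44 observations are needed, and by symmetry $d$ lies midway between $\theta_0$ and $\theta_1$.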

We shall assume that $|\theta_1 - \theta_0|$ is small so that the approximation
formula (3:57) for the expected value of $n$ can be used. Since $L(\theta_0)
= 1 - \alpha$ and $L(\theta_1) = \beta$, we obtain from (3:57)

(3:68)  $E_1(n) = \frac{\beta \log B + (1 - \beta) \log A}{E_1(z)}$

and

(3:69)  $E_0(n) = \frac{(1 - \alpha) \log B + \alpha \log A}{E_0(z)}$

where $E_i(n)$ denotes the expected value of $n$ when $H_i$ is true ($i = 0, 1$).
As can easily be verified,

(3:70)  $E_1(z) = \frac{1}{2}(\theta_0 - \theta_1)^2$

and

(3:71)  $E_0(z) = -\frac{1}{2}(\theta_0 - \theta_1)^2$

From (3:67), (3:68), (3:69), (3:70), and (3:71) we obtain

(3:72)  $\frac{E_1(n)}{n(\alpha, \beta)} = \frac{2}{(\lambda_1 - \lambda_0)^2} [\beta \log B + (1 - \beta) \log A]$

and

(3:73)  $\frac{E_0(n)}{n(\alpha, \beta)} = \frac{2}{(\lambda_1 - \lambda_0)^2} [-(1 - \alpha) \log B - \alpha \log A]$

It is interesting to note that the ratios $\frac{E_1(n)}{n(\alpha, \beta)}$ and $\frac{E_0(n)}{n(\alpha, \beta)}$ are
independent of the parameter values $\theta_0$ and $\theta_1$. The average saving of the
sequential test as compared with the current test is $100\left[1 - \frac{E_1(n)}{n(\alpha, \beta)}\right]$
per cent if $H_1$ is true, and $100\left[1 - \frac{E_0(n)}{n(\alpha, \beta)}\right]$ per cent if $H_0$ is true.



In Table 1, panel A shows the value of $100\left[1 - \frac{E_1(n)}{n(\alpha, \beta)}\right]$, and panel B
shows the value of $100\left[1 - \frac{E_0(n)}{n(\alpha, \beta)}\right]$, for several values of $\alpha$ and $\beta$.
Because of the symmetry of the normal distribution, panel B is
obtained from panel A simply by interchanging $\alpha$ and $\beta$.


TABLE 1

Average Percentage Saving in Size of Sample with Sequential Analysis,
as Compared with Current Most Powerful Test for Testing Mean
of a Normally Distributed Variate

A. When alternative hypothesis is true:

  β \ α     .01    .02    .03    .04    .05

   .01       58     60     61     62     63
   .02       54     56     57     58     59
   .03       51     53     54     55     55
   .04       49     50     51     52     53
   .05       47     49     50     50     51

B. When null hypothesis is true:

  β \ α     .01    .02    .03    .04    .05

   .01       58     54     51     49     47
   .02       60     56     53     50     49
   .03       61     57     54     51     50
   .04       62     58     55     52     50
   .05       63     59     55     53     51

As the table shows, for the range of $\alpha$ and $\beta$ from .01 to .05 (the
range most frequently employed), the sequential test results in an
average saving of at least 47 per cent in the necessary number of
observations as compared with the current test. The true saving is slightly
higher than shown in the table, since $E_i(n)$ ($i = 0, 1$) calculated under
the condition that $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$ is greater than
$E_i(n)$ calculated under the condition that $A = A(\alpha, \beta)$ and $B = B(\alpha, \beta)$.
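The entries of Table 1 follow from (3:72) and (3:73) alone. The sketch below shows one way to evaluate them; the function name is hypothetical, and the inverse of $G$ is taken from Python's standard `statistics.NormalDist` rather than from a printed table, so small rounding differences from the tabulated values are possible.

```python
import math
from statistics import NormalDist


def percent_saving(alpha, beta):
    """Return the average percentage saving (under H1, under H0).

    Uses (3:72) and (3:73) with A = (1 - beta)/alpha and B = beta/(1 - alpha).
    """
    g_inv = NormalDist().inv_cdf
    lam0 = g_inv(1.0 - alpha)          # G(lam0) = 1 - alpha
    lam1 = g_inv(beta)                 # G(lam1) = beta
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    scale = 2.0 / (lam1 - lam0) ** 2
    ratio_h1 = scale * (beta * log_b + (1.0 - beta) * log_a)        # (3:72)
    ratio_h0 = scale * (-(1.0 - alpha) * log_b - alpha * log_a)     # (3:73)
    return 100.0 * (1.0 - ratio_h1), 100.0 * (1.0 - ratio_h0)


if __name__ == "__main__":
    for alpha, beta in [(0.01, 0.01), (0.01, 0.05), (0.05, 0.05)]:
        s1, s0 = percent_saving(alpha, beta)
        print(alpha, beta, round(s1), round(s0))
```

For $\alpha = \beta = .01$ both savings round to 58 per cent, and for $\alpha = .01$, $\beta = .05$ they round to 47 per cent under $H_1$ and 63 per cent under $H_0$, in agreement with the table.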



3.7 Lower Limit of the Probability That the Sequential Test Will 
Terminate with a Number of Trials Less Than or Equal to a 
Given Number 


In Section A.6 an approximate formula for the probability
distribution of the number of observations required by the sequential test
is derived in the case in which $z = \log \frac{f(x, \theta_1)}{f(x, \theta_0)}$ is normally distributed.¹¹
It is pointed out that the same distribution function of $n$ can be
regarded as an approximation to the exact distribution even when $z$
is not normally distributed, provided that the absolute value of $E(z)$
and the standard deviation of $z$ are sufficiently small as compared with
$\log A$ and $\log B$. Although the distribution of $n$ given in Section A.6
could be used to determine the probability that $n \le n_0$ for any fixed
integer $n_0$, we shall prefer to derive a lower limit for this probability
by a different method for the following reasons. (1) The computation
of the lower limit given in this section is very simple, whereas the use
of the distribution function given in Section A.6 would require
laborious computations, since that distribution function has not yet been
tabulated. (2) If $n_0$ is fairly large and if $\alpha$ and $\beta$ are small, as they
usually are in practice, the lower bound given in this section will be
fairly near the exact value.

For any given positive integer $n_0$ let $P_i(n \le n_0)$ denote the probability
that $n \le n_0$ when $H_i$ is true, i.e., when $\theta = \theta_i$ ($i = 0, 1$).¹² We want
to derive a lower bound for $P_i(n \le n_0)$. It will be assumed that $n_0$ is
sufficiently large so that the sum $z_1 + \cdots + z_{n_0}$ may be regarded as
normally distributed even when the distribution of $z$ is not normal.¹³
If $\sum_{\alpha=1}^{n_0} z_\alpha \ge \log A$, then we certainly have $n \le n_0$. Similarly, if
$\sum_{\alpha=1}^{n_0} z_\alpha \le \log B$, we must have $n \le n_0$. Hence

(3:74)  $P_1\left(\sum_{\alpha=1}^{n_0} z_\alpha \ge \log A\right) \le P_1(n \le n_0)$

and

(3:75)  $P_0\left(\sum_{\alpha=1}^{n_0} z_\alpha \le \log B\right) \le P_0(n \le n_0)$


¹¹ See formulas (A:166), (A:183) and (A:194).

¹² In general, for any relation $R$ we use the symbol $P_i(R)$ to denote the probability
that $R$ holds when $H_i$ is true.

¹³ According to well-known theorems in the theory of probability, the sum of a
large number of independent random variables is nearly normally distributed under
very general conditions.





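The inequality $\sum_{\alpha=1}^{n_0} z_\alpha \ge \log A$ can be written as

(3:76)  $\frac{\sum_{\alpha=1}^{n_0} z_\alpha - n_0 E_1(z)}{\sqrt{n_0}\,\sigma_1(z)} \ge \frac{\log A - n_0 E_1(z)}{\sqrt{n_0}\,\sigma_1(z)}$

where $\sigma_1(z)$ denotes the standard deviation of $z$ when $H_1$ is true. The
left-hand member of (3:76) is normally distributed with mean 0 and
variance unity when $H_1$ is true. For any value $\lambda$ we shall denote by
$G(\lambda)$ the probability that a normally distributed random variable with
mean 0 and variance unity will take a value less than $\lambda$. Thus, the
probability that such a random variable takes a value $\ge \lambda$ is given by
$1 - G(\lambda)$. Hence the probability that (3:76) holds when $H_1$ is true is
equal to $1 - G[\lambda_1(n_0)]$ where

(3:77)  $\lambda_1(n_0) = \frac{\log A - n_0 E_1(z)}{\sqrt{n_0}\,\sigma_1(z)}$

But the probability that (3:76) holds when $H_1$ is true is equal to
$P_1\left(\sum_{\alpha=1}^{n_0} z_\alpha \ge \log A\right)$. Thus,

(3:78)  $P_1\left(\sum_{\alpha=1}^{n_0} z_\alpha \ge \log A\right) = 1 - G[\lambda_1(n_0)]$

Because of (3:74), we obtain

$1 - G[\lambda_1(n_0)] \le P_1(n \le n_0)$

Thus, $1 - G[\lambda_1(n_0)]$ is a lower limit of the probability that $n \le n_0$
when $H_1$ is true.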

To obtain a lower limit for $P_0(n \le n_0)$, we rewrite the inequality
$\sum_{\alpha=1}^{n_0} z_\alpha \le \log B$ in the form

(3:79)  $\frac{\sum_{\alpha=1}^{n_0} z_\alpha - n_0 E_0(z)}{\sqrt{n_0}\,\sigma_0(z)} \le \frac{\log B - n_0 E_0(z)}{\sqrt{n_0}\,\sigma_0(z)} = \lambda_0(n_0)$, say

where $\sigma_0(z)$ denotes the standard deviation of $z$ when $H_0$ is true. Since
the left-hand member of (3:79) is normally distributed with mean 0
and variance unity when $H_0$ is true, the probability that (3:79) holds
when $H_0$ is true is equal to $G[\lambda_0(n_0)]$. Hence,

(3:80)  $P_0\left(\sum_{\alpha=1}^{n_0} z_\alpha \le \log B\right) = G[\lambda_0(n_0)]$




Because of (3:75), we then have

(3:81)  $G[\lambda_0(n_0)] \le P_0(n \le n_0)$

Thus $G[\lambda_0(n_0)]$ is a lower bound of the probability that $n \le n_0$ when
$H_0$ is true.

When $\log A = \log \frac{1 - \beta}{\alpha}$ and $\log B = \log \frac{\beta}{1 - \alpha}$, Table 2
shows the values of the lower bounds of $P_0(n \le n_0)$ and $P_1(n \le n_0)$
corresponding to different pairs $(\alpha, \beta)$ and different values of $n_0$. In
these calculations it has been assumed that the distribution under $H_0$
is a normal distribution with mean 0 and variance unity, and the
distribution under $H_1$ is a normal distribution with mean $\theta$ and variance
unity. For each pair $(\alpha, \beta)$ the value of $\theta$ has been determined from
(3:67) so that the number of observations needed for the current most
powerful test of strength $(\alpha, \beta)$ is equal to 1000.
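The lower bounds $1 - G[\lambda_1(n_0)]$ and $G[\lambda_0(n_0)]$ can be evaluated directly under the normal setup just described. The sketch below is an illustration with a hypothetical function name. It uses the fact that, for $\theta_0 = 0$, $\theta_1 = \theta$ and unit variance, (3:59) reduces to $z = \theta x - \theta^2/2$, so that $\sigma_0(z) = \sigma_1(z) = \theta$; $G$ and its inverse are taken from Python's standard `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def termination_lower_bounds(alpha, beta, n0, n_fixed=1000):
    """Lower bounds for P1(n <= n0) and P0(n <= n0), from (3:77) to (3:81).

    Assumes H0: N(0, 1) and H1: N(theta, 1), with theta chosen through (3:67)
    so that the current test of strength (alpha, beta) needs n_fixed trials.
    """
    g = NormalDist()
    lam0 = g.inv_cdf(1.0 - alpha)
    lam1 = g.inv_cdf(beta)
    theta = (lam0 - lam1) / math.sqrt(n_fixed)        # solves (3:67) for theta
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    e1, e0 = 0.5 * theta ** 2, -0.5 * theta ** 2      # (3:70) and (3:71)
    sd = math.sqrt(n0) * theta                        # sqrt(n0) * sigma(z)
    bound_h1 = 1.0 - g.cdf((log_a - n0 * e1) / sd)    # 1 - G[lambda_1(n0)]
    bound_h0 = g.cdf((log_b - n0 * e0) / sd)          # G[lambda_0(n0)]
    return bound_h1, bound_h0


if __name__ == "__main__":
    print(termination_lower_bounds(0.01, 0.01, 1000))
    print(termination_lower_bounds(0.01, 0.05, 1000))
```

For $n_0 = 1000$ this gives approximately .910 for both bounds when $\alpha = \beta = .01$, and approximately .799 and .891 when $\alpha = .01$, $\beta = .05$, matching the first row of Table 2.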


TABLE 2

Lower Bound of the Probability that a Sequential Analysis Will
Terminate within Various Numbers of Trials, when the Most
Powerful Current Test Requires Exactly 1000 Trials

            α = .01 and β = .01       α = .01 and β = .05       α = .05 and β = .05
 Number     Alternative  Null         Alternative  Null         Alternative  Null
 of         hypothesis   hypothesis   hypothesis   hypothesis   hypothesis   hypothesis
 Trials     true         true         true         true         true         true

  1000        .910         .910         .799         .891         .773         .773
  1200        .950         .950         .871         .932         .837         .837
  1400        .972         .972         .916         .957         .883         .883
  1600        .985         .985         .946         .972         .915         .915
  1800        .991         .991         .965         .982         .938         .938
  2000        .995         .995         .977         .989         .955         .955
  2200        .997         .997         .985         .993         .967         .967
  2400        .999         .999         .990         .995         .976         .976
  2600        .999         .999         .994         .997         .982         .982
  2800       1.00         1.00          .996         .998         .987         .987
  3000       1.00         1.00          .997         .999         .990         .990


The probabilities given are lower bounds for the true probabilities. They relate
to a test of the mean of a normally distributed variate, the difference between the
null and alternative hypothesis being adjusted for each pair of values of α and β
so that the number of trials required under the most powerful current test is exactly
1000.





3.8 Truncation of the Sequential Test Procedure 

Although it is shown in Section A.1 that the probability is 1 that
the sequential test procedure will eventually terminate, it is
occasionally desirable to set a definite upper limit, say $n_0$, for the number of
observations. This can be achieved by truncating the sequential
process at $n = n_0$, i.e., by giving a new rule for the acceptance or rejection
of $H_0$ at the $n_0$th trial if the sequential process did not lead to a final
decision for $n \le n_0$. A simple and reasonable rule for truncation at
the $n_0$th trial seems to be the following: If the sequential probability
ratio test does not lead to a final decision for $n \le n_0$, accept $H_0$ at the
$n_0$th trial when $\log B < \sum_{\alpha=1}^{n_0} z_\alpha \le 0$, and reject $H_0$ when
$0 < \sum_{\alpha=1}^{n_0} z_\alpha < \log A$.
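The truncation rule can be stated as a short procedure. The following sketch is an illustration with hypothetical names; it takes the successive values $z_1, z_2, \cdots$ as already computed and assumes the usual boundaries of the sequential probability ratio test, i.e., rejection of $H_0$ when the cumulative sum reaches $\log A$ and acceptance when it falls to $\log B$.

```python
import math


def truncated_sprt(z_values, log_a, log_b, n0):
    """Apply the sequential test to z_1, z_2, ... with truncation at trial n0.

    Returns (decision, n), where decision is "accept H0" or "reject H0"
    and n is the number of observations used.
    """
    s = 0.0
    for n, z in enumerate(z_values, start=1):
        s += z
        if s >= log_a:                 # ordinary termination: reject H0
            return "reject H0", n
        if s <= log_b:                 # ordinary termination: accept H0
            return "accept H0", n
        if n == n0:
            # Truncation: log B < s < log A, so decide by the sign of s.
            return ("accept H0" if s <= 0.0 else "reject H0"), n
    raise ValueError("fewer than n0 observations supplied before a decision")


if __name__ == "__main__":
    log_a, log_b = math.log(19.0), math.log(1.0 / 19.0)   # illustrative bounds
    print(truncated_sprt([0.5] * 10, log_a, log_b, 4))
    print(truncated_sprt([-1.0] * 10, log_a, log_b, 10))
```

In the first call no boundary is reached in four trials and the positive sum leads to rejection at the fourth trial; in the second call the lower boundary is crossed at the third trial, so truncation plays no part.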

By truncating the sequential process at the $n_0$th trial we shall,
however, change the probabilities of errors of the first and second kinds.
Let $\alpha$ and $\beta$ be the probabilities of errors of the first and second kinds
if the sequential test is not truncated. The effect of the truncation
on $\alpha$ and $\beta$ will, of course, depend on the value of $n_0$. The larger $n_0$,
the smaller will be the effect of truncation on $\alpha$ and $\beta$. We shall denote
the resulting probabilities of errors of the first and second kinds by
$\alpha(n_0)$ and $\beta(n_0)$, respectively, if the sequential process is truncated at
$n = n_0$. In this section we shall derive upper bounds for $\alpha(n_0)$ and
$\beta(n_0)$.

To obtain an upper bound for $\alpha(n_0)$ we have to consider the cases
in which the truncated process leads to the rejection of $H_0$, while the
non-truncated process leads to the acceptance of $H_0$. Denote by
$\rho_0(n_0)$ the probability under $H_0$ of obtaining a sample such that the
truncated process leads to the rejection of $H_0$, while the non-truncated
process leads to the acceptance of $H_0$. Then, we clearly have

(3:82)  $\alpha(n_0) \le \alpha + \rho_0(n_0)$

The reason that in (3:82) the inequality sign holds instead of the
equality sign is that there may be samples for which the truncated
process leads to the acceptance of $H_0$, while the non-truncated process
leads to the rejection of $H_0$. To obtain an upper bound for $\alpha(n_0)$, we
merely need to derive an upper bound for $\rho_0(n_0)$. By definition,
$\rho_0(n_0)$ is the probability under $H_0$ that for the successive observations
the following three conditions are simultaneously fulfilled:

(i) $\log B < \sum_{\alpha=1}^{n} z_\alpha < \log A$ for $n = 1, \cdots, n_0 - 1$

(ii) $0 < \sum_{\alpha=1}^{n_0} z_\alpha < \log A$

(iii) When the sequential process is continued beyond $n_0$, it
terminates with the acceptance of $H_0$.


Denote by $\bar{\rho}_0(n_0)$ the probability under $H_0$ that condition (ii) will
be fulfilled, i.e.,

(3:83)  $\bar{\rho}_0(n_0) = P_0\left(0 < \sum_{\alpha=1}^{n_0} z_\alpha < \log A\right)$

Since the probability that condition (ii) is fulfilled cannot be smaller
than the probability that all three conditions are fulfilled
simultaneously, we have

$\rho_0(n_0) \le \bar{\rho}_0(n_0)$

and, therefore,

(3:84)  $\alpha(n_0) \le \alpha + \bar{\rho}_0(n_0)$

Thus, $\alpha + \bar{\rho}_0(n_0)$ is an upper bound for $\alpha(n_0)$, which can easily be
computed, as will be shown later. To obtain an upper bound for
$\beta(n_0)$ we shall denote by $\rho_1(n_0)$ the probability (under $H_1$) that the
successive observations will be such that the truncated process leads
to the acceptance of $H_0$, while the non-truncated process leads to the
rejection of $H_0$. In other words, $\rho_1(n_0)$ is the probability under $H_1$
that the successive observations will satisfy the following three
conditions simultaneously:

(i) $\log B < \sum_{\alpha=1}^{n} z_\alpha < \log A$ for $n = 1, \cdots, n_0 - 1$

(ii) $\log B < \sum_{\alpha=1}^{n_0} z_\alpha \le 0$

(iii) If the process is continued beyond the $n_0$th trial, it terminates
with the acceptance of $H_1$.

Clearly

(3:85)  $\beta(n_0) \le \beta + \rho_1(n_0)$

Since it is difficult to determine the value of $\rho_1(n_0)$, we shall derive
a simple upper bound for it. Let $\bar{\rho}_1(n_0)$ be the probability under $H_1$
that condition (ii) is fulfilled, i.e.,

(3:86)  $\bar{\rho}_1(n_0) = P_1\left(\log B < \sum_{\alpha=1}^{n_0} z_\alpha \le 0\right)$

Then $\rho_1(n_0) \le \bar{\rho}_1(n_0)$ and we have

(3:87)  $\beta(n_0) \le \beta + \bar{\rho}_1(n_0)$


We shall now show how $\bar{\rho}_0(n_0)$ and $\bar{\rho}_1(n_0)$ can be computed. We
shall assume that $n_0$ is sufficiently large so that $z_1 + \cdots + z_{n_0}$ may be
regarded as a normally distributed variable. When $H_i$ is true ($i =
0, 1$) the expected value of $z_1 + \cdots + z_{n_0}$ is equal to $n_0 E_i(z)$ and the
standard deviation of $z_1 + \cdots + z_{n_0}$ is equal to $\sqrt{n_0}\,\sigma_i(z)$ where $\sigma_i(z)$
denotes the standard deviation of $z$ when $H_i$ is true. To compute
$\bar{\rho}_0(n_0)$, we shall write the inequality $0 < \sum_{\alpha=1}^{n_0} z_\alpha < \log A$ in the
following form:

(3:88)  $\frac{-n_0 E_0(z)}{\sqrt{n_0}\,\sigma_0(z)} < \frac{z_1 + \cdots + z_{n_0} - n_0 E_0(z)}{\sqrt{n_0}\,\sigma_0(z)} < \frac{\log A - n_0 E_0(z)}{\sqrt{n_0}\,\sigma_0(z)}$

Let

(3:89)  $\nu_1 = \frac{-n_0 E_0(z)}{\sqrt{n_0}\,\sigma_0(z)}$ and $\nu_2 = \frac{\log A - n_0 E_0(z)}{\sqrt{n_0}\,\sigma_0(z)}$

Since the middle term in (3:88) is normally distributed with zero mean
and unit variance when $H_0$ is true, the probability that (3:88) is
fulfilled when $H_0$ is true is equal to $G(\nu_2) - G(\nu_1)$ where $G(\nu)$ denotes the
probability that a normally distributed variable with mean 0 and
variance unity will take a value $< \nu$. Thus,

(3:90)  $\bar{\rho}_0(n_0) = G(\nu_2) - G(\nu_1)$


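To compute $\bar{\rho}_1(n_0)$, we shall write the inequality $\log B < \sum_{\alpha=1}^{n_0} z_\alpha \le 0$
in the following form:

(3:91)  $\frac{\log B - n_0 E_1(z)}{\sqrt{n_0}\,\sigma_1(z)} < \frac{z_1 + \cdots + z_{n_0} - n_0 E_1(z)}{\sqrt{n_0}\,\sigma_1(z)} \le \frac{-n_0 E_1(z)}{\sqrt{n_0}\,\sigma_1(z)}$

Let

(3:92)  $\nu_3 = \frac{\log B - n_0 E_1(z)}{\sqrt{n_0}\,\sigma_1(z)}$ and $\nu_4 = \frac{-n_0 E_1(z)}{\sqrt{n_0}\,\sigma_1(z)}$

Since the middle term in (3:91) is normally distributed with mean 0
and variance unity when $H_1$ is true, the probability (under $H_1$) that
(3:91) holds is equal to $G(\nu_4) - G(\nu_3)$. Hence,

(3:93)  $\bar{\rho}_1(n_0) = G(\nu_4) - G(\nu_3)$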




Our results can thus be summarized as follows:

(3:94)  $\alpha(n_0) \le \alpha + G(\nu_2) - G(\nu_1)$

and

(3:95)  $\beta(n_0) \le \beta + G(\nu_4) - G(\nu_3)$

where $\nu_1$, $\nu_2$, $\nu_3$, and $\nu_4$ are given in (3:89) and (3:92). These upper
bounds may considerably exceed $\alpha(n_0)$ and $\beta(n_0)$, respectively. It
would be desirable to find closer limits.

Table 3 shows the values of the upper bounds of $\alpha(n_0)$ and $\beta(n_0)$
given in (3:94) and (3:95) corresponding to different pairs $(\alpha, \beta)$ and
different values of $n_0$. In these calculations we have put $\log A =
\log \frac{1 - \beta}{\alpha}$ and $\log B = \log \frac{\beta}{1 - \alpha}$, and assumed that the
distribution under $H_0$ is normal with mean 0 and variance unity, and the
distribution under $H_1$ is normal with mean $\theta$ and variance unity. For
each pair $(\alpha, \beta)$ the value of $\theta$ has been determined so that the number
of observations required by the current most powerful test of strength
$(\alpha, \beta)$ is equal to 1000.

TABLE 3

Effect on Risks of Error of Truncating a Sequential Analysis
at a Predetermined Number of Trials

            α = .01 and β = .01       α = .01 and β = .05       α = .05 and β = .05
 Number     Upper        Upper        Upper        Upper        Upper        Upper
 of         bound of     bound of     bound of     bound of     bound of     bound of
 Trials     effective    effective    effective    effective    effective    effective
            α            β            α            β            α            β

  1000        .020         .020         .033         .070         .095         .095
  1200        .015         .015         .024         .063         .082         .082
  1400        .013         .013         .019         .058         .072         .072
  1600        .012         .012         .016         .055         .066         .066
  1800        .011         .011         .014         .053         .062         .062
  2000        .010         .010         .012         .052         .058         .058
  2200        .010         .010         .012         .051         .056         .056
  2400        .010         .010         .011         .051         .055         .055
  2600        .010         .010         .011         .051         .053         .053
  2800        .010         .010         .010         .050         .053         .053
  3000        .010         .010         .010         .050         .052         .052

If the sequential analysis is based on the values α and β shown, but a decision
is made at $n_0$ trials even when the normal sequential criteria would require a
continuation of the process, the realized values $\alpha(n_0)$ and $\beta(n_0)$ will not exceed the
tabular entries. The table relates to a test of the mean of a normally distributed
variate, the difference between the null and alternative hypotheses being adjusted
for each pair (α, β) so that the number of trials required by the current test is 1000.

It seems to the author that the upper limits given in (3:94) and
(3:95) are considerably above the true $\alpha(n_0)$ and $\beta(n_0)$, respectively,
when $n_0$ is not much higher than the value of $n$ needed for the current
most powerful test.
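The bounds (3:94) and (3:95) can be evaluated under the same normal setup. The sketch below is an illustration with a hypothetical function name; it uses $E_1(z) = -E_0(z) = \theta^2/2$ and $\sigma_0(z) = \sigma_1(z) = \theta$, which hold when $H_0$ has mean 0, $H_1$ has mean $\theta$, and the variance is unity, and it takes $G$ and its inverse from Python's standard `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def truncation_upper_bounds(alpha, beta, n0, n_fixed=1000):
    """Upper bounds (3:94) and (3:95) for alpha(n0) and beta(n0).

    Assumes H0: N(0, 1) and H1: N(theta, 1), with theta chosen through (3:67)
    so that the current test of strength (alpha, beta) needs n_fixed trials.
    """
    g = NormalDist()
    lam0 = g.inv_cdf(1.0 - alpha)
    lam1 = g.inv_cdf(beta)
    theta = (lam0 - lam1) / math.sqrt(n_fixed)       # solves (3:67) for theta
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    e1, e0 = 0.5 * theta ** 2, -0.5 * theta ** 2
    sd = math.sqrt(n0) * theta                       # sqrt(n0) * sigma(z)
    v1 = -n0 * e0 / sd                               # (3:89)
    v2 = (log_a - n0 * e0) / sd
    v3 = (log_b - n0 * e1) / sd                      # (3:92)
    v4 = -n0 * e1 / sd
    alpha_bound = alpha + g.cdf(v2) - g.cdf(v1)      # (3:94)
    beta_bound = beta + g.cdf(v4) - g.cdf(v3)        # (3:95)
    return alpha_bound, beta_bound


if __name__ == "__main__":
    print(truncation_upper_bounds(0.01, 0.01, 1000))
    print(truncation_upper_bounds(0.01, 0.05, 1000))
```

For $n_0 = 1000$ this gives approximately .020 for both bounds when $\alpha = \beta = .01$, and approximately .033 and .070 when $\alpha = .01$, $\beta = .05$, in agreement with Table 3; other entries may differ from the tabulated values in the last digit because of rounding.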

3.9 Increase in the Expected Number of Observations Caused by
Replacing the Exact Values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ by $(1 - \beta)/\alpha$ and
$\beta/(1 - \alpha)$, Respectively

The quantities $A(\alpha, \beta)$ and $B(\alpha, \beta)$ denote the values of $A$ and $B$
for which the probabilities of errors of the first and second kinds
associated with the sequential probability ratio test are exactly $\alpha$ and $\beta$,
respectively. In Section 3.3 it has been recommended that $A(\alpha, \beta)$
and $B(\alpha, \beta)$ be replaced by $a(\alpha, \beta) = (1 - \beta)/\alpha$ and $b(\alpha, \beta) =
\beta/(1 - \alpha)$, respectively. This may slightly increase the expected
number of observations, since $a(\alpha, \beta) \ge A(\alpha, \beta)$ and $b(\alpha, \beta) \le B(\alpha, \beta)$.¹⁴
The present section gives estimates of the amount of such increase in
the expected number of observations.

In Section 3.5 the following approximation formula has been
obtained for the expected number of observations:

(3:96)  $E_\theta(n) \approx \frac{L(\theta) \log B + [1 - L(\theta)] \log A}{E_\theta(z)}$

Since $L(\theta_0) = 1 - \alpha$ and $L(\theta_1) = \beta$, we obtain from (3:96)

(3:97)  $E_0(n) \approx \frac{(1 - \alpha) \log B + \alpha \log A}{E_0(z)}$

and

(3:98)  $E_1(n) \approx \frac{\beta \log B + (1 - \beta) \log A}{E_1(z)}$

where $E_i(n)$ denotes the expected value of $n$ when $H_i$ is true.

¹⁴ See inequalities (3:21) and (3:22).


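Thus, the changes $\Delta E_0(n)$ and $\Delta E_1(n)$ in the expected values $E_0(n)$
and $E_1(n)$ caused by using $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and
$B(\alpha, \beta)$, respectively, are given by

(3:99)  $\Delta E_0(n) \approx \frac{(1 - \alpha)[\log b(\alpha, \beta) - \log B(\alpha, \beta)] + \alpha[\log a(\alpha, \beta) - \log A(\alpha, \beta)]}{E_0(z)} = \frac{(1 - \alpha) \log \frac{b(\alpha, \beta)}{B(\alpha, \beta)} + \alpha \log \frac{a(\alpha, \beta)}{A(\alpha, \beta)}}{E_0(z)}$

and

(3:100)  $\Delta E_1(n) \approx \frac{\beta[\log b(\alpha, \beta) - \log B(\alpha, \beta)] + (1 - \beta)[\log a(\alpha, \beta) - \log A(\alpha, \beta)]}{E_1(z)} = \frac{\beta \log \frac{b(\alpha, \beta)}{B(\alpha, \beta)} + (1 - \beta) \log \frac{a(\alpha, \beta)}{A(\alpha, \beta)}}{E_1(z)}$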


Formulas (3:99) and (3:100) are, of course, approximation formulas,
since (3:97) and (3:98) are approximations. However, if the error in
the formulas (3:97) and (3:98), i.e., if the differences

(3:101a)  $E_0(n) - \frac{(1 - \alpha) \log B + \alpha \log A}{E_0(z)}$

and

(3:101b)  $E_1(n) - \frac{\beta \log B + (1 - \beta) \log A}{E_1(z)}$

were exactly independent of the quantities $A$ and $B$, then in (3:99) and
(3:100) the equality sign would hold exactly. It can be shown that
small changes in $A$ and $B$ affect the differences (3:101) exceedingly
little, and, therefore, (3:99) and (3:100) are very close approximations.

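We shall derive upper bounds for the right-hand members of (3:99)
and (3:100). Since $E_0(z)$ and $\log \frac{b(\alpha, \beta)}{B(\alpha, \beta)}$ are negative,¹⁵ while
$\log \frac{a(\alpha, \beta)}{A(\alpha, \beta)}$ is positive, we have

(3:102)  $\frac{(1 - \alpha) \log \frac{b(\alpha, \beta)}{B(\alpha, \beta)} + \alpha \log \frac{a(\alpha, \beta)}{A(\alpha, \beta)}}{E_0(z)} < \frac{(1 - \alpha) \log \frac{b(\alpha, \beta)}{B(\alpha, \beta)}}{E_0(z)} < \frac{1}{E_0(z)} \log \frac{b(\alpha, \beta)}{B(\alpha, \beta)}$

Similarly, since $E_1(z)$ and $\log \frac{a(\alpha, \beta)}{A(\alpha, \beta)}$ are positive, while
$\log \frac{b(\alpha, \beta)}{B(\alpha, \beta)}$ is negative, we have

(3:103)  $\frac{\beta \log \frac{b(\alpha, \beta)}{B(\alpha, \beta)} + (1 - \beta) \log \frac{a(\alpha, \beta)}{A(\alpha, \beta)}}{E_1(z)} < \frac{(1 - \beta) \log \frac{a(\alpha, \beta)}{A(\alpha, \beta)}}{E_1(z)} < \frac{1}{E_1(z)} \log \frac{a(\alpha, \beta)}{A(\alpha, \beta)}$

¹⁵ It is remarked at the end of Section A.2.1 that $E(z)$ and a certain quantity $h_0$
defined there have opposite signs. Since $h_0 = 1$ if $H_0$ is true, and $h_0 = -1$ if $H_1$ is
true, it follows that $E_0(z) < 0$ and $E_1(z) > 0$.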

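Thus, for all practical purposes $\frac{1}{E_1(z)} \log \frac{a(\alpha, \beta)}{A(\alpha, \beta)}$ is an upper bound for
$\Delta E_1(n)$ and $\frac{1}{E_0(z)} \log \frac{b(\alpha, \beta)}{B(\alpha, \beta)}$ is an upper bound for $\Delta E_0(n)$. The exact
values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ not being known, we cannot yet use these
limits. Since $E_1(z) > 0$, an upper limit of $\frac{1}{E_1(z)} \log \frac{a(\alpha, \beta)}{A(\alpha, \beta)}$ is obtained
by substituting for $\frac{a(\alpha, \beta)}{A(\alpha, \beta)}$ an upper bound of $\frac{a(\alpha, \beta)}{A(\alpha, \beta)}$. Similarly,
since $E_0(z) < 0$, an upper limit of $\frac{1}{E_0(z)} \log \frac{b(\alpha, \beta)}{B(\alpha, \beta)}$ can be obtained by
substituting for $\frac{b(\alpha, \beta)}{B(\alpha, \beta)}$ a lower bound of $\frac{b(\alpha, \beta)}{B(\alpha, \beta)}$.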

From equations (A:29) and (A:30) in the Appendix one can derive
the following inequalities:

(3:104)  $\frac{a(\alpha, \beta)}{A(\alpha, \beta)} \le \delta_{\theta_0}$

and

(3:105)  $\frac{b(\alpha, \beta)}{B(\alpha, \beta)} \ge \eta_{\theta_0}$

where the quantities $\delta_\theta$ and $\eta_\theta$ are defined by equations (A:27) and
(A:28).¹⁶ The quantities $\delta_\theta$ and $\eta_\theta$ have been explicitly computed for
binomial and normal distributions.

Thus, we arrive at the following result: For all practical purposes
we may regard $(\log \delta_{\theta_0})/E_1(z)$ as an upper bound for $\Delta E_1(n)$ and
$(\log \eta_{\theta_0})/E_0(z)$ as an upper bound for $\Delta E_0(n)$.

TABLE 4

Increase in Expected Number of Observations Resulting from
Approximations in Criteria for Terminating a Sequential
Process

 Number of
 Observations Needed     α = .01     α = .01     α = .05
 for the Current         β = .01     β = .05     β = .05
 Most Powerful Test

        10                 1.1         1.3         1.6
        30                 1.9         2.2         2.7
       100                 3.4         4.0         4.9
       200                 4.9         5.7         6.9
       500                 7.7         9.0        10.9
      1000                10.8        12.7        15.4


The tabular entries may, for practical purposes, be treated as upper bounds of
the exact increases. The table relates to a test of the mean of a normally distributed
variate, the difference between the null and alternative hypotheses being adjusted
for each pair of values of α and β so that the number of trials required under the
best current test is as shown in the left-hand column.

¹⁶ This can be seen as follows: Substituting $A(\alpha, \beta)$ for $A$, $B(\alpha, \beta)$ for $B$, and $\theta_0$
for $\theta$, we obtain from (A:29) and (A:30)

IS(«, ^ 

^ [A(a. 

Since we let $A = A(\alpha, \beta)$ and $B = B(\alpha, \beta)$, we have $L(\theta_0) = 1 - \alpha$ and $L(\theta_1) = \beta$.

It follows from this and the two equations which are obtained from (A:18) by
substituting $\theta_0$ and $\theta_1$ for $\theta$ that

Eoa* = ^ 0) and ^ ^ ^ = a(«, (3) 

Since = 1» ^’<3 obtain 

$B(\alpha, \beta)\,\eta_{\theta_0} \le b(\alpha, \beta)$ and $a(\alpha, \beta) \le A(\alpha, \beta)\,\delta_{\theta_0}$
from which (3:104) and (3:105) follow.



As an example, consider the case in which the distribution under
$H_0$ is normal with zero mean and unit variance, and the distribution
under $H_1$ is normal with mean $\theta$ and variance unity. Since for the
normal distribution $\eta_\theta = 1/\delta_\theta$ [see equation (A:51)] and $-E_0(z) =
E_1(z)$, the upper bound of $\Delta E_0(n)$ is the same as the upper bound of
$\Delta E_1(n)$. This upper bound depends only on the value of $\theta$. For any
pair $(\alpha, \beta)$ and for any positive integer $m$ there exists exactly one value
of $\theta$ such that $m$ observations are needed for the current most powerful
test of strength $(\alpha, \beta)$. Thus, with each integer $m$ and pair $(\alpha, \beta)$
there is associated exactly one value of $\theta$. Table 4 shows the common
upper bound of $\Delta E_0(n)$ and $\Delta E_1(n)$ calculated for values of $\theta$
corresponding to different pairs $(\alpha, \beta)$ and integers $m$.


Chapter 4. OUTLINE OF A THEORY OF SEQUENTIAL TESTS
OF SIMPLE AND COMPOSITE HYPOTHESES AGAINST A SET
OF ALTERNATIVES


In Chapter 3 we were concerned mainly with the theoretical case of
testing a simple hypothesis $H_0$ against a single alternative $H_1$. In
problems arising in applications, the unknown parameter, or
parameters, can usually take infinitely many values. In this chapter we
shall discuss sequential tests of simple and composite hypotheses
against infinitely many alternatives.

4.1 Tests of Simple Hypotheses 
4.1.1 Introductory Remarks 

A simple hypothesis has been defined as a statement which specifies
completely the values of all the unknown parameters. We should like
to make some remarks concerning the conditions under which a test
of a simple hypothesis is meaningful and appropriate. For this
purpose it will be sufficient to consider the case in which there is only one
unknown parameter $\theta$ involved in the distribution of the random
variable $x$ under consideration. A simple hypothesis is then a statement
that $\theta$ is equal to some specified value $\theta_0$.

In applications the problem of testing a hypothesis usually arises
as follows: There are two alternative courses of action, say action 1
and action 2, between which a decision is to be made, and the
preference for one or the other action depends on the value of the parameter
$\theta$. Let $\omega$ denote the set of all values of $\theta$ for which action 1 is preferred
to action 2; then action 2 is preferred to action 1 for all values $\theta$
outside $\omega$.¹ Let $H_\omega$ be the hypothesis that $\theta$ is contained in $\omega$. Then the
problem of deciding between the two courses of action can be
formulated as the problem of testing the hypothesis $H_\omega$. If $H_\omega$ is accepted
we take action 1 and if $H_\omega$ is rejected we take action 2. If the degree
of preference for one or the other action varies continuously with the
value of $\theta$, the set $\omega$ cannot consist of a single value $\theta_0$. In fact, if $\omega$
were to contain only the single value $\theta_0$, it would mean that we prefer
action 1 when $\theta = \theta_0$ and we prefer action 2 for any $\theta \ne \theta_0$, no matter
how near $\theta$ is to $\theta_0$. Thus, we would have a discontinuity in our
preference scale at $\theta = \theta_0$.

¹ For values $\theta$ on the boundary of $\omega$ it will usually be inconsequential which
action is taken.

We see that the problem of testing a simple hypothesis arises, strictly
speaking, only if there is a discontinuity in our preference scale for
actions 1 and 2. While a discontinuity in the preference scale is, of
course, possible, it will occur rather seldom. A discontinuity in the
preference scale may occur, for example, if we want to test the validity
of some hypothetical scientific theory which implies that the
parameter $\theta$ must have a specified value $\theta_0$. In such a case any deviation of
the value of $\theta$ from $\theta_0$, no matter how small, is of importance, since it
invalidates the hypothetical theory in question.

Whenever the degree of preference for one or the other action varies 
continuously with the value of 0, the hypothesis to be tested will have 
to be, strictly speaking, a composite one. Nevertheless, frequently it 
will be expedient to approximate the composite hypothesis by a simple 
one, since the latter is usually a simpler problem to treat. As an illustration,
consider the following example: Suppose that the hardness x
of a material varies from unit to unit and is normally distributed with
a known variance. The mean value θ of x is, however, unknown. Suppose
that θ₀ is considered to be the most desirable value of θ and the
material is considered less desirable the greater |θ − θ₀|. Let action
1 be acceptance of the material and action 2, rejection of the material.
Preference for acceptance is strongest when θ = θ₀. The preference
for acceptance will decrease steadily as |θ − θ₀| increases. There will
be a positive value δ such that for |θ − θ₀| > δ rejection of the material
is preferred and the degree of preference for rejection increases
with increasing value of |θ − θ₀| in the domain |θ − θ₀| > δ. If
|θ − θ₀| = δ, i.e., if the quality of the product is just on the margin,
neither action is preferable to the other. In such a situation the proper
hypothesis to be tested is the composite hypothesis that |θ − θ₀| ≤ δ.

However, if δ is small, the composite hypothesis may be replaced for
practical purposes by the simple hypothesis that θ = θ₀. The test of
the hypothesis that θ = θ₀ will have nearly the same operating characteristic
function as the test of the hypothesis that |θ − θ₀| ≤ δ, for
the following reasons. To test the hypothesis that |θ − θ₀| ≤ δ we
subdivide the θ-axis into three zones: zone of preference for acceptance,
zone of preference for rejection, and zone of indifference. As explained
in Section 2.3.1, the zone of preference for acceptance consists of all
values θ for which acceptance is strongly preferred, i.e., for which the
rejection of the material is considered an error of practical importance.
Similarly, the zone of preference for rejection consists of all those values
for which rejection is strongly preferred, whereas for values θ in the
indifferent zone the preference for one action over the other is only
slight and we do not care particularly which action is taken. In our
example the three zones may reasonably be defined as follows. We
select two positive values δ₀ < δ and δ₁ > δ. The zone of preference
for acceptance is given by |θ − θ₀| ≤ δ₀, the zone of preference for
rejection by |θ − θ₀| ≥ δ₁, and the zone of indifference by δ₀ <
|θ − θ₀| < δ₁. The test procedure will then be constructed so that
the probability of rejection will not exceed a preassigned value α whenever
θ is in the zone of preference for acceptance, and the probability
of acceptance will not exceed a preassigned value β whenever θ is in
the zone of preference for rejection.² Now if we replace the original
composite hypothesis by the simple hypothesis that θ = θ₀, the zone
of preference for acceptance will consist of the single value θ = θ₀.
The zone of preference for rejection may be defined, as before, by
|θ − θ₀| ≥ δ₁. The zone of indifference is then given by 0 < |θ − θ₀|
< δ₁. The test procedure for testing that θ = θ₀ will then satisfy the
requirement that the probability of rejecting the hypothesis is α when
θ = θ₀ and the probability of accepting the hypothesis does not exceed
β whenever |θ − θ₀| ≥ δ₁. If δ₀ is very small, the test of the hypothesis
that θ = θ₀ will satisfy the requirements imposed on the test of
the original composite hypothesis with close approximation, since the
probability of rejecting the hypothesis will be nearly equal to α for
values θ in a sufficiently small neighborhood of θ₀. Thus, for practical
purposes we may replace the original composite hypothesis by the
simple hypothesis that θ = θ₀.

As we have seen, a test of a simple hypothesis will occur in applications
in two cases: (1) when there is a discontinuity in the preference
scale and the problem calls for testing a simple hypothesis in the strict
sense (these cases are rare); (2) when the problem is such that it calls
for testing a composite hypothesis and it is approximated by a simple
hypothesis merely for the sake of simplicity.

In terms of the zones of preference for acceptance, of preference for 
rejection, and of indifference, the simple hypothesis may be character- 
ized by the condition that the zone of preference for acceptance con- 
sists of a single point. 

4.1.2 Test of a Simple Hypothesis against One-Sided Alternatives 

We shall discuss here the simple case in which there is only one unknown
parameter θ and the hypothesis that θ = θ₀ is tested against
alternative values of θ which lie on one side of θ₀, say θ > θ₀. In other
words, only values of θ > θ₀ are considered admissible alternatives to
the hypothesis to be tested. In this case the zone of preference for
acceptance consists of the single value θ₀. The degree of preference
for rejection of the hypothesis will generally increase with increasing
value of θ in the domain θ > θ₀. It will, therefore, be possible to find
a value θ₁ > θ₀ such that the acceptance of the hypothesis is considered
an error of practical importance whenever θ ≥ θ₁, while for
values θ > θ₀ but < θ₁ the acceptance of the hypothesis is an error of
no particular practical consequence. Thus, the zone of preference for
rejection may be defined by θ ≥ θ₁, and the zone of indifference by
θ₀ < θ < θ₁.

² In this connection see Section 2.3.2.

According to Section 2.3.2 we shall impose the following requirements
on the OC function of the test. The probability that the hypothesis
will be rejected should be equal to a preassigned value α when
θ = θ₀. The probability of accepting the hypothesis should not exceed
a preassigned value β whenever θ ≥ θ₁.

In most of the important cases occurring in practice, such as when
x has a normal, binomial, or Poisson distribution, and so on, the sequential
probability ratio test of strength (α, β) for testing the hypothesis
that θ = θ₀ against the single alternative θ₁ will satisfy the
imposed requirements, since the probability of an error of the second
kind will decrease steadily with increasing values of θ in the domain
θ ≥ θ₁. Thus, in all these cases the sequential probability ratio test
for testing the hypothesis that θ = θ₀ against a properly chosen alternative
θ₁ provides a satisfactory solution to our problem.

The case in which the alternative values of 0 are restricted to values 
θ < θ₀ instead of values > θ₀ is entirely analogous and need not be
discussed separately. 
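
As a rough illustration of the one-sided procedure of this section, the following Python sketch carries out the sequential probability ratio test of θ = θ₀ against a single alternative θ₁ > θ₀ for a normal variable with known standard deviation. The function names, the string labels for the decisions, and the choice of the normal case are assumptions made for illustration only; the boundaries use the approximations A = (1 − β)/α and B = β/(1 − α) quoted from Section 3.3.

```python
import math


def wald_thresholds(alpha, beta):
    """Approximate boundaries A = (1 - beta)/alpha and B = beta/(1 - alpha)."""
    return (1.0 - beta) / alpha, beta / (1.0 - alpha)


def sprt_normal_mean(observations, theta0, theta1, sigma, alpha, beta):
    """Test theta = theta0 against theta = theta1 (normal, known sigma).

    Returns (decision, n), where decision is "reject", "accept", or
    "continue" when the data run out before a boundary is reached.
    """
    a, b = wald_thresholds(alpha, beta)
    log_a, log_b = math.log(a), math.log(b)
    log_ratio = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        # Contribution of one observation to log(p_1n / p_0n).
        log_ratio += ((x - theta0) ** 2 - (x - theta1) ** 2) / (2.0 * sigma ** 2)
        if log_ratio >= log_a:
            return "reject", n
        if log_ratio <= log_b:
            return "accept", n
    return "continue", n
```

With θ₀ = 0, θ₁ = 1, and σ = 1 each observation x adds x − 1/2 to the logarithm of the ratio, so observations well above 1/2 drive the test toward rejection and observations well below it toward acceptance.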

4.1.3 Test of a Simple Hypothesis with No Restrictions on the 
Alternative Values of the Unknown Parameters 

In this section we shall deal with the following general problem: The
distribution of x involves k unknown parameters θ₁, …, θ_k and the
hypothesis H₀ to be tested is that θ₁, …, θ_k are equal to some
specified values θ₁⁰, …, θ_k⁰, respectively. The set of k parameters
(θ₁, …, θ_k) will be denoted by θ without any subscript and will
be referred to as a parameter point. The use of a superscript to the
letter θ, such as θ⁰ or θ¹, etc., will indicate that a particular parameter
point is meant. Our hypothesis H₀ can thus be expressed by stating
that the unknown parameter point θ is equal to the particular parameter
point θ⁰.

As we have seen in the preceding section, the zone of preference for
acceptance consists of the single parameter point θ⁰. Denote the zone
of preference for rejection by ω_r. This will usually be the set of all
points θ whose "distance" (defined in some sense) from θ⁰ is greater
than or equal to some given positive value. The requirements imposed
on the OC function of the test, as formulated in Section 2.3.2, can then
be stated as follows: The probability that H₀ will be rejected when
θ = θ⁰ should be equal to a preassigned value α and the probability
that H₀ will be accepted should not exceed a preassigned value β for
any parameter point θ in the zone ω_r.

Before we discuss the problem of constructing a proper sequential
test satisfying the above requirements, we shall consider the problem
of finding a proper test procedure satisfying the following modified
requirements. For any θ in ω_r let β(θ) denote the probability that H₀
will be accepted when θ is the true parameter point. Thus β(θ) is the
probability of an error of the second kind when θ is true. Our original
requirement was that β(θ) should not exceed a preassigned value β for
all θ in ω_r. Instead we shall now require that the weighted average of
β(θ), weighted with a given weight function w(θ), should be equal to
β, i.e.,

(4:1)    \int_{\omega_r} \beta(\theta)\, w(\theta)\, d\theta = \beta

where w(θ) ≥ 0 for all θ in ω_r and ³

(4:2)    \int_{\omega_r} w(\theta)\, d\theta = 1

The requirement that the probability of rejecting H₀ when H₀ is true
be equal to a preassigned α is maintained as before. A proper sequential
test procedure satisfying these modified requirements can easily
be constructed. Let p_0n be the probability distribution of the sample
(x₁, …, x_n) when H₀ is true, i.e.,

(4:3)    p_{0n} = f(x_1, \theta_1^0, \ldots, \theta_k^0)\, f(x_2, \theta_1^0, \ldots, \theta_k^0) \cdots f(x_n, \theta_1^0, \ldots, \theta_k^0)

Furthermore, let p_1n be defined by

(4:4)    p_{1n} = \int_{\omega_r} f(x_1, \theta_1, \ldots, \theta_k) \cdots f(x_n, \theta_1, \ldots, \theta_k)\, w(\theta)\, d\theta

Thus, p_1n is a weighted average of the probability distribution functions
f(x₁, θ₁, …, θ_k) ⋯ f(x_n, θ₁, …, θ_k) corresponding to various
parameter points θ in ω_r. As such, p_1n itself is a probability distribution
function of the sample (x₁, …, x_n).⁴ Let H₁ denote the hypothesis
that the distribution of the sample (x₁, …, x_n) is given by p_1n
defined in (4:4). Then H₁ is a simple hypothesis, since it specifies
completely the distribution. Consider the sequential probability ratio
test of strength (α, β) for testing H₀ against the simple alternative
hypothesis H₁. This procedure is given as follows. Reject H₀ if

(4:5)    \frac{p_{1n}}{p_{0n}} \geq A

accept H₀ if

(4:6)    \frac{p_{1n}}{p_{0n}} \leq B

and take an additional observation if

(4:7)    B < \frac{p_{1n}}{p_{0n}} < A

The expressions p_0n and p_1n are given by (4:3) and (4:4), respectively,
and the constants A and B are to be chosen so that the test will have
the required strength (α, β). As we have seen in Section 3.3, for most
practical purposes we may use the approximation formulas A =
(1 − β)/α and B = β/(1 − α).

³ The weight function w(θ) may also be discrete. A single formula valid for both
continuous and discrete weight functions could be given by using Stieltjes integrals
in (4:1) and (4:2).

The sequential probability ratio test defined by (4:5), (4:6), and 
(4:7) can be shown to satisfy the relation (4:1). Thus, this probability 
ratio test may be regarded as a satisfactory solution to our problem if 
our requirement is that the probability of an error of the first kind 
should be α and that β(θ) should satisfy (4:1).
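
The ratio p_1n/p_0n of (4:3) and (4:4) can be sketched numerically when the weight function is discrete, a case the text explicitly allows. In the Python sketch below the names `normal_density` and `mixture_ratio`, the use of a one-parameter normal density with unit standard deviation, and the representation of the weight function as a list of (point, weight) pairs are all illustrative assumptions and not part of the text.

```python
import math


def normal_density(x, theta, sigma=1.0):
    """Density of a normal variable with mean theta (illustrative choice of f)."""
    z = (x - theta) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def mixture_ratio(xs, density, theta0, alternatives, weights):
    """Return p_1n / p_0n, with p_1n a discrete weighted average as in (4:4).

    `alternatives` are parameter points in the rejection zone and
    `weights` are non-negative numbers summing to 1, the discrete
    analogue of condition (4:2).
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be non-negative and sum to 1")
    # p_0n: product of densities under the hypothesis, as in (4:3).
    p0 = math.prod(density(x, theta0) for x in xs)
    # p_1n: weighted average of the products under each alternative.
    p1 = sum(
        w * math.prod(density(x, theta) for x in xs)
        for theta, w in zip(alternatives, weights)
    )
    return p1 / p0
```

When all the weight is placed on a single alternative, the function reduces to the ordinary probability ratio for two simple hypotheses.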

In practical problems, however, it seems more reasonable to maintain
the original requirements. That is to say, we shall want a test
procedure such that the probability β(θ) of accepting H₀ does not
exceed β for all parameter points θ in the zone ω_r, and the probability
is α that we shall reject H₀ when θ = θ⁰. There are, in general, infinitely
many sequential tests which satisfy these requirements, and we
want to select one for which the expected number of observations is
as small as possible.

⁴ The distribution of the sample (x₁, …, x_n) will be precisely given by p_1n if
we assume that θ in ω_r has a probability distribution given by the density function
w(θ).

⁵ Although the successive observations x₁, x₂, …, etc., are not independent
when H₁ is true (p_1n cannot be represented as a product of n factors where the αth
factor depends only on x_α), the results and conclusions in Sections 3.2 and 3.3
remain valid, as pointed out in Section 3.2.




Although a thorough investigation of this problem has not yet been
made, the following approach may perhaps be reasonable. First we
restrict ourselves to the class C of sequential probability ratio tests
based on the ratio p_1n/p_0n, where p_0n is given by (4:3) and p_1n by
(4:4), corresponding to an arbitrary non-negative weight function w(θ)
satisfying (4:2).⁶ Thus, the class C contains at least as many tests as
there are possible weight functions w(θ) satisfying (4:2). A test in
class C is uniquely determined by choosing a particular weight function
w(θ) and particular values for A and B. The test procedure is
then carried out in the usual way. H₀ is accepted if p_1n/p_0n ≤ B,
H₀ is rejected if p_1n/p_0n ≥ A, and an additional observation is made
if B < p_1n/p_0n < A. The restriction to the class C of sequential tests
is suggested by the fact that we have been led to these tests by the
requirement that some weighted average of the probabilities of errors
of the second kind be equal to a given value β.

Accepting the restriction that the sequential test should be a member
of the class C, we still need a principle for choosing the weight
function w(θ). Suppose that the quantities A and B have already
been determined. Let us then examine what would be a reasonable
choice of w(θ). After A and B have been chosen, the probability α of
making an error of the first kind is also determined for practical purposes
and the choice of w(θ) will not affect it.⁷ Thus, the choice of
w(θ) will affect only β(θ). A weight function w(θ) may be regarded
the more favorable the smaller the maximum value of β(θ) with respect
to θ (θ is, of course, restricted to points in ω_r). Thus, the following
choice of w(θ) seems reasonable: For given values of A and B the weight
function w(θ) is chosen for which the maximum of β(θ) with respect to θ
(θ restricted to points in ω_r) takes its smallest value. When this principle
for the choice of w(θ) is adopted, α and the maximum of β(θ) with
respect to θ (θ in ω_r) will depend only on the quantities A and B.
These values A and B are then determined so that the probability of
an error of the first kind has the desired value α and the maximum
of β(θ) with respect to θ is equal to the required value β.

⁶ Instead of defining p_1n by some weighted average of the type given in (4:4), it
would seem equally reasonable to define p_1n as the maximum of f(x₁, θ) ⋯ f(x_n, θ)
with respect to θ where θ is restricted to points in ω_r. Then the ratio p_1n/p_0n would
coincide with the so-called likelihood ratio introduced by J. Neyman and E. Pearson
and widely used in current test procedures. Our reason for preferring weighted
averages is that the theory of such tests seems to be considerably simpler. If
p_1n were defined by the maximum with respect to θ in ω_r, p_1n would no longer be a
probability distribution.

⁷ In fact, with good approximation the following relations hold: (1 − β)/α = A
and β/(1 − α) = B where β = ∫_{ω_r} β(θ) w(θ) dθ. Solving these equations with
respect to α and β we obtain α = (1 − B)/(A − B) and β = [B(A − 1)]/(A − B).
Thus, α and β depend only on A and B.
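
The approximate relations of footnote 7 between the boundaries (A, B) and the error probabilities (α, β) can be checked numerically. The short Python sketch below is offered only as an illustration; the function names are hypothetical, and the formulas are the approximations quoted in the footnote, not exact expressions.

```python
def boundaries_from_errors(alpha, beta):
    """A = (1 - beta)/alpha and B = beta/(1 - alpha)."""
    return (1.0 - beta) / alpha, beta / (1.0 - alpha)


def errors_from_boundaries(a, b):
    """Invert the relations: alpha = (1 - B)/(A - B), beta = B(A - 1)/(A - B)."""
    alpha = (1.0 - b) / (a - b)
    beta = b * (a - 1.0) / (a - b)
    return alpha, beta


# Round trip: starting from (alpha, beta) and returning to it.
a, b = boundaries_from_errors(0.05, 0.10)
alpha, beta = errors_from_boundaries(a, b)
```

The round trip returns the original pair up to rounding, which is one way of seeing that α and β are fixed, for practical purposes, once A and B have been chosen.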

There is no general method yet available for the determination of
an optimum weight function w(θ) in the sense defined above. For
some special but important cases, however, such a weight function has
been determined. This point is discussed in Section A.8.


4.1.4 Application of the General Procedure to Testing the Mean 
of a Normal Distribution with Known Variance 

In this section we shall consider the problem of testing the simple
hypothesis H₀ that the mean θ of a normal distribution with known
variance is equal to a particular value θ₀. The acceptance of H₀ will
not be considered a serious error if θ ≠ θ₀ but is near θ₀. However,
there will be, in general, a positive value δ such that the acceptance
of H₀ is considered an error of practical importance if (and only if)
|θ − θ₀|/σ ≥ δ, where σ denotes the known standard deviation of the
distribution. Thus, the region of preference for rejection may be defined
as the set of all values θ for which |θ − θ₀|/σ ≥ δ. The region of
preference for acceptance will consist of the single value θ₀, and the
region of indifference will be the set of all values θ for which
0 < |θ − θ₀|/σ < δ.

The probability density of the sample (x₁, …, x_n) under H₀ is
given by

(4:8)    p_{0n} = \frac{1}{(2\pi)^{n/2}\sigma^{n}}\, e^{-\frac{1}{2\sigma^{2}} \sum_{\alpha=1}^{n} (x_\alpha - \theta_0)^{2}}

According to the general theory discussed in the preceding section,
p_1n is defined as some weighted average of the probability density corresponding
to various values of θ in the zone of preference for rejection.
It is shown in Section A.8.2 that an optimum weighted average
is the simple average of the two density functions: the density function
corresponding to θ = θ₀ − δσ and the density function corresponding
to θ = θ₀ + δσ. Thus,

(4:9)    p_{1n} = \frac{1}{2} \left[ \frac{1}{(2\pi)^{n/2}\sigma^{n}}\, e^{-\frac{1}{2\sigma^{2}} \sum_{\alpha=1}^{n} (x_\alpha - \theta_0 + \delta\sigma)^{2}} + \frac{1}{(2\pi)^{n/2}\sigma^{n}}\, e^{-\frac{1}{2\sigma^{2}} \sum_{\alpha=1}^{n} (x_\alpha - \theta_0 - \delta\sigma)^{2}} \right]







The test is then carried out as follows. We continue taking observations
as long as B < p_1n/p_0n < A. If p_1n/p_0n ≥ A, we reject H₀.
If p_1n/p_0n ≤ B, we accept H₀. To make the probability of an error
of the first kind equal to α and the maximum of β(θ) (in the domain
|θ − θ₀|/σ ≥ δ) equal to β, for all practical purposes we may put
A = (1 − β)/α and B = β/(1 − α).

A more detailed discussion of this test procedure is given in Part II,
Chapter 9.
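
Dividing (4:9) by (4:8) and expanding the squares gives p_1n/p_0n = exp(−nδ²/2) · cosh(δS/σ), where S is the sum of the deviations x_α − θ₀; this simplification is our own algebra and is not stated in this section. The Python sketch below uses it to carry out the two-sided test. The function names and decision labels are hypothetical.

```python
import math


def log_ratio_from_sum(n, s, sigma, delta):
    """log(p_1n / p_0n) = -n*delta^2/2 + log cosh(delta * s / sigma).

    `s` is the sum of (x - theta0) over the first n observations.
    log cosh(u) is evaluated as |u| + log1p(exp(-2|u|)) - log 2 so that
    large values of |u| do not overflow.
    """
    u = abs(delta * s / sigma)
    log_cosh = u + math.log1p(math.exp(-2.0 * u)) - math.log(2.0)
    return -0.5 * n * delta ** 2 + log_cosh


def two_sided_sprt(observations, theta0, sigma, delta, alpha, beta):
    """Sequential test of theta = theta0 against |theta - theta0|/sigma >= delta."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    s = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        s += x - theta0
        log_ratio = log_ratio_from_sum(n, s, sigma, delta)
        if log_ratio >= log_a:
            return "reject", n
        if log_ratio <= log_b:
            return "accept", n
    return "continue", n
```

Because the ratio depends on the data only through n and the absolute value of S, deviations of θ from θ₀ in either direction are treated alike.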


4.2 Tests of Composite Hypotheses

4.2.1 Discussion of an Important Special Case 

A frequent and important problem is that of testing the hypothesis
H that the unknown parameter θ does not exceed a specified value θ′.⁸
This problem is of particular importance in quality control of manufactured
products. The importance of an error of the first kind (rejection
of H when H is true), or that of an error of the second kind
(acceptance of H when H is false), will usually vary with the value of
θ. For example, if θ is only slightly below θ′ the rejection of H will
not be considered a serious error. Similarly, if θ is only slightly above
θ′ the acceptance of H will not be considered a serious error. In general,
the importance of an error of the first kind will increase steadily
with decreasing value of θ in the domain θ ≤ θ′, and the importance
of an error of the second kind will increase steadily with increasing
value of θ in the domain θ > θ′. Thus, it will be possible to find two
values θ₀ < θ′ and θ₁ > θ′ such that an error of the first kind is considered
of practical importance whenever θ ≤ θ₀, and an error of the
second kind is considered of practical importance whenever θ ≥ θ₁,
whereas for values θ between θ₀ and θ₁ we do not care particularly
which decision is made. Hence the zone of preference for acceptance
may be defined as consisting of all values θ ≤ θ₀, the zone of preference
for rejection as the set of values θ for which θ ≥ θ₁, and the zone of
indifference as the set of all values θ for which θ₀ < θ < θ₁. In such
a situation we shall want a test procedure for which the probability
of an error of the first kind is less than or equal to a preassigned α
whenever θ ≤ θ₀, and the probability of an error of the second kind is
less than or equal to a preassigned β whenever θ ≥ θ₁. In most of the
important cases occurring in practice, such as when x has a normal,
binomial, or Poisson distribution, and so on, the sequential probability
ratio test of strength (α, β) for testing the hypothesis that θ = θ₀
against the single alternative that θ = θ₁ will have the desired properties
and provides a satisfactory solution to the problem. If the
sequential probability ratio test leads to the acceptance of the hypothesis
that θ = θ₀, we accept the original hypothesis that θ ≤ θ′, and if
the probability ratio test leads to the rejection of the hypothesis that
θ = θ₀, we reject the original hypothesis that θ ≤ θ′.

⁸ It is assumed here that there is only one unknown parameter θ involved in the
distribution of x.

As an illustration, we shall discuss briefly one or two examples.
Suppose that a lot consisting of a large number of units of a manufactured
product is submitted for acceptance inspection. We shall
assume that each unit is classified in one of the two categories: defective
and non-defective. The proportion p of defectives in the lot
is assumed to be unknown. The preference for acceptance or rejection
of the lot will, of course, depend on the value of p. It will be
possible, in general, to select two values of p, say p₀ and p₁ (p₀ < p₁)
such that the rejection of the lot is considered an error of practical
importance whenever p ≤ p₀, and the acceptance of the lot is an error
of practical importance whenever p ≥ p₁; for values p between p₀ and
p₁ we do not care particularly which decision is made. Thus, the zone
of preference for acceptance is given by p ≤ p₀, the zone of preference
for rejection by p ≥ p₁, and the zone of indifference consists of values
p for which p₀ < p < p₁. Hence, we shall want a test procedure for
which the probability of rejecting the lot is less than or equal to a
preassigned value α whenever p ≤ p₀, and the probability of accepting
the lot is less than or equal to a preassigned value β whenever
p ≥ p₁. Such a test procedure is given by the sequential probability
ratio test of strength (α, β) for testing the hypothesis that p = p₀
against the single alternative that p = p₁. To compute the probability
ratio p_1n/p_0n for this problem, we shall denote by d_n the number
of defectives found in the first n units inspected. The probability of
obtaining a sample equal to the observed one is given by

(4:10)    p_{1n} = p_1^{d_n} (1 - p_1)^{n - d_n}

when p = p₁, and by

(4:11)    p_{0n} = p_0^{d_n} (1 - p_0)^{n - d_n}

when p = p₀. Then

(4:12)    \log \frac{p_{1n}}{p_{0n}} = d_n \log \frac{p_1}{p_0} + (n - d_n) \log \frac{1 - p_1}{1 - p_0}

⁹ Formulas (4:10) and (4:11) are strictly valid only if the lot contains infinitely
many units. It is assumed that the lot contains a large number of units so that
these formulas can be used with good approximation.




The test procedure is carried out as follows. We continue inspection
as long as log B < log (p_1n/p_0n) < log A. If log (p_1n/p_0n) ≥
log A, inspection is terminated with the rejection of the lot, and if
log (p_1n/p_0n) ≤ log B, inspection is terminated with the acceptance of
the lot. For practical purposes we may put A = (1 − β)/α and B =
β/(1 − α).

A detailed discussion of the problem of acceptance inspection when 
each unit is classified either as defective or as non-defective is given 
in Part II in Chapter 5. 
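
The inspection procedure based on (4:12) may be sketched in Python as follows. The function name, the coding of each inspected unit as 1 (defective) or 0 (non-defective), and the returned tuple are assumptions made for illustration; the boundaries are the approximate values A = (1 − β)/α and B = β/(1 − α) given above.

```python
import math


def inspect_lot(units, p0, p1, alpha, beta):
    """Sequential acceptance inspection of p = p0 against p = p1 (p0 < p1).

    `units` yields 1 for a defective unit and 0 otherwise.
    Returns (decision, n, d_n) with decision "reject", "accept",
    or "continue" if inspection stops before a boundary is reached.
    """
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    step_defective = math.log(p1 / p0)
    step_good = math.log((1.0 - p1) / (1.0 - p0))
    n = 0
    d = 0
    for n, unit in enumerate(units, start=1):
        if unit:
            d += 1
        # Formula (4:12) evaluated after n units with d defectives.
        log_ratio = d * step_defective + (n - d) * step_good
        if log_ratio >= log_a:
            return "reject", n, d
        if log_ratio <= log_b:
            return "accept", n, d
    return "continue", n, d
```

For p₀ = 0.1, p₁ = 0.3, α = 0.05, and β = 0.10, each defective raises the logarithm of the ratio by log 3 and each non-defective lowers it by log (9/7), so a run of defectives ends inspection much sooner than a run of good units.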

Another example for testing a hypothesis that θ ≤ θ′ is the case
when θ is the unknown mean of a normal distribution with known
variance.¹⁰ Again it will be possible to select two values θ₀ < θ′ and
θ₁ > θ′ such that an error of the first kind is considered of practical
importance whenever θ ≤ θ₀, and an error of the second kind is of practical
importance whenever θ ≥ θ₁; for values θ between θ₀ and θ₁ we
do not care particularly which decision is made. In such a situation
we shall want a test procedure for which the probability of committing
an error of the first kind is less than or equal to some preassigned value
α whenever θ ≤ θ₀, and the probability of committing an error of the
second kind does not exceed a preassigned value β whenever θ ≥ θ₁.
These conditions will be satisfied by the sequential probability ratio
test of strength (α, β) for testing the hypothesis that θ = θ₀ against
the single alternative hypothesis that θ = θ₁. The probability density
of the sample (x₁, …, x_n) is given by

(4:13)    p_{0n} = \frac{1}{(2\pi)^{n/2}\sigma^{n}}\, e^{-\frac{1}{2\sigma^{2}} \sum_{\alpha=1}^{n} (x_\alpha - \theta_0)^{2}}

when θ = θ₀, and by

(4:14)    p_{1n} = \frac{1}{(2\pi)^{n/2}\sigma^{n}}\, e^{-\frac{1}{2\sigma^{2}} \sum_{\alpha=1}^{n} (x_\alpha - \theta_1)^{2}}

when θ = θ₁. We continue taking observations as long as B <
p_1n/p_0n < A. If p_1n/p_0n ≥ A, we reject the hypothesis that θ ≤ θ′,
and if p_1n/p_0n ≤ B we accept the hypothesis that θ ≤ θ′. Again, we
put A = (1 − β)/α and B = β/(1 − α).
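
For computation it may help to note that the ratio of the two normal densities depends on the observations only through their sum. The short derivation below is our own algebra under the assumption θ₁ > θ₀ and is not part of the text; it follows by taking logarithms of the two densities and expanding the squares.

```latex
\log\frac{p_{1n}}{p_{0n}}
  = \frac{1}{2\sigma^{2}} \sum_{\alpha=1}^{n}
      \left[ (x_\alpha - \theta_0)^{2} - (x_\alpha - \theta_1)^{2} \right]
  = \frac{\theta_1 - \theta_0}{\sigma^{2}} \sum_{\alpha=1}^{n} x_\alpha
    - \frac{n\,(\theta_1^{2} - \theta_0^{2})}{2\sigma^{2}}

% Hence B < p_{1n}/p_{0n} < A is equivalent to
\frac{\sigma^{2}}{\theta_1 - \theta_0}\,\log B + n\,\frac{\theta_0 + \theta_1}{2}
  \;<\; \sum_{\alpha=1}^{n} x_\alpha \;<\;
\frac{\sigma^{2}}{\theta_1 - \theta_0}\,\log A + n\,\frac{\theta_0 + \theta_1}{2}
```

In this form the test amounts to comparing the running sum of the observations with two parallel straight lines in n.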


4.2.2 Outline of the Test Procedure in the General Case 
In testing a composite hypothesis H_ω that the parameter point θ lies
in a subset ω of the parameter space, the parameter space is again
subdivided into three mutually exclusive zones: the zone of preference
for acceptance ω_a, the zone of preference for rejection ω_r, and the zone
of indifference. The zone of preference for acceptance will now consist
of more than one parameter point, as distinguished from the case
of testing a simple hypothesis.

¹⁰ This problem is discussed in detail in Part II, Chapter 7.

For any test procedure the probability of an error of the first kind
(rejecting H_ω when H_ω is true) will, in general, vary with the parameter
point in ω. For any parameter point θ in ω we shall denote by
α(θ) the probability that H_ω will be rejected when θ is true. Similarly,
the probability of an error of the second kind (accepting H_ω when
it is false) is a function β(θ) defined for all points θ outside ω.

According to the requirements formulated in Section 2.3.2, we shall
want a test procedure such that α(θ) will not exceed a preassigned
value α for all θ in the zone ω_a and β(θ) will not exceed a preassigned
value β for all θ in the zone ω_r. Before discussing the problem of
finding a proper test procedure satisfying these requirements, we shall
again consider, as in the case of the simple hypothesis, the following
modified problem: Let w_a(θ) and w_r(θ) be two non-negative functions
of θ, called weight functions, such that ¹¹

(4:15)    \int_{\omega_a} w_a(\theta)\, d\theta = 1 \quad \text{and} \quad \int_{\omega_r} w_r(\theta)\, d\theta = 1

Suppose that we wish to construct a sequential test such that the
weighted average ∫_{ω_a} α(θ) w_a(θ) dθ of the probabilities of errors of the
first kind is equal to a given value α, and the weighted average
∫_{ω_r} β(θ) w_r(θ) dθ of the probabilities of errors of the second kind is a
given value β.

A proper sequential test satisfying these modified requirements can
be constructed as follows. Let p_0n and p_1n be defined by

(4:16)    p_{0n} = \int_{\omega_a} f(x_1, \theta_1, \ldots, \theta_k) \cdots f(x_n, \theta_1, \ldots, \theta_k)\, w_a(\theta)\, d\theta

and

(4:17)    p_{1n} = \int_{\omega_r} f(x_1, \theta_1, \ldots, \theta_k) \cdots f(x_n, \theta_1, \ldots, \theta_k)\, w_r(\theta)\, d\theta

where f(x, θ₁, …, θ_k) denotes the probability distribution of x when
θ is true. The functions p_0n and p_1n can be interpreted as probability
distributions of the sample (x₁, …, x_n). Denote by H₀* the hypothesis
that the distribution of the sample (x₁, …, x_n) is given by (4:16),
and by H₁* the hypothesis that the distribution of (x₁, …, x_n) is given
by (4:17). The sequential probability ratio test of strength (α, β) for
testing H₀* against H₁* provides a solution to our problem. If the
constants A and B in this sequential test are chosen so that the probability
is α that we reject H₀* when H₀* is true, and the probability
is β that we accept H₀* when H₁* is true, then for this sequential test
we have

\int_{\omega_a} w_a(\theta)\, \alpha(\theta)\, d\theta = \alpha

and

\int_{\omega_r} w_r(\theta)\, \beta(\theta)\, d\theta = \beta

To make the strength of the test of H₀* against H₁* equal to (α, β),
again, for practical purposes, we may put A = (1 − β)/α and B =
β/(1 − α).

¹¹ The weight functions w_a(θ) and w_r(θ) may also be discrete. Formulas valid
for both continuous and discrete weight functions could be given by using Stieltjes
integrals in (4:15) and subsequent equations.

To construct a sequential test procedure satisfying the requirements

(4:18)    \alpha(\theta) \leq \alpha \quad \text{for all } \theta \text{ in } \omega_a

and

(4:19)    \beta(\theta) \leq \beta \quad \text{for all } \theta \text{ in } \omega_r

we shall restrict ourselves to sequential probability ratio tests for which
p_0n and p_1n are given by (4:16) and (4:17), respectively, and w_a(θ)
and w_r(θ) may be any weight functions satisfying (4:15). Denote by
C the class of all such tests corresponding to all possible weight functions
w_a(θ) and w_r(θ). To select a proper test from the class C which
satisfies the requirements (4:18) and (4:19), our procedure will be similar
to that in the case of simple hypotheses, as discussed in Section
4.1.3. A test in class C is uniquely determined by the choice of the
constants A and B and by the weight functions w_a(θ) and w_r(θ). Thus,
the maximum of α(θ) with respect to θ in the zone ω_a, as well as the
maximum of β(θ) with respect to θ in the zone ω_r, is determined uniquely
by A, B, w_a(θ), and w_r(θ). Denote these maxima by α[A, B, w_a, w_r]
and β[A, B, w_a, w_r], respectively. For given values A and B, the
weight functions w_a(θ) and w_r(θ) may be regarded the more desirable
the smaller they make α[A, B, w_a, w_r] and β[A, B, w_a, w_r]. Thus, if it
is possible to find weight functions w_a(θ) and w_r(θ) for which both
α[A, B, w_a, w_r] and β[A, B, w_a, w_r] are simultaneously minimized, they
may be regarded as optimum weight functions. It is shown in Section
A.9 that in some important special cases, such as testing the mean of
a normal distribution with unknown variance, optimum weight functions
of the type described above do exist. However, it is not known
whether they generally exist. If it is not possible to minimize both
α[A, B, w_a, w_r] and β[A, B, w_a, w_r] simultaneously, it may be reasonable
to choose w_a(θ) and w_r(θ) such that some average of the two
values α[A, B, w_a, w_r] and β[A, B, w_a, w_r], or the maximum of these
two values, is minimized.

If the principle described above for choosing the weight functions
w_a(θ) and w_r(θ) is adopted, the maximum of α(θ) in the zone ω_a and
the maximum of β(θ) in the zone ω_r will depend only on A and B.
Finally the constants A and B are determined so that these two maxima
are equal to α and β, respectively.

There is no general method yet available for constructing weight
functions w_a(θ) and w_r(θ) which are optimum in the sense defined
above. In some special cases, however, such weight functions have
been constructed.¹²


4.2.3 Application of the General Procedure to Testing the Mean
of a Normal Distribution with Unknown Variance (Sequential
t-Test)

A frequent and important problem in applications is that of testing
the hypothesis H that the unknown mean θ of a normal distribution
is equal to some specified value θ₀ when nothing is known about the
variance σ² of the distribution. If the true value θ differs only slightly
from θ₀, i.e., if |θ − θ₀| is only a small fraction of the standard deviation
σ, the acceptance of H will usually not be considered an error of
practical consequence. However, the importance of an error committed
by accepting H when θ ≠ θ₀ will, in general, increase with increasing
value of |θ − θ₀|/σ. Thus, it will be possible to find a positive value
δ such that the acceptance of H is considered an error of practical
importance only when |θ − θ₀|/σ ≥ δ. Accordingly, the three zones in
the parameter space will be defined as follows. The zone of preference
for acceptance consists of all points (θ, σ) for which θ = θ₀, i.e.,
consists of all points (θ₀, σ) where σ can take any positive value.
The zone ω_r of preference for rejection consists of all points (θ, σ) for
which |θ − θ₀|/σ ≥ δ. Finally the zone of indifference contains all
points (θ, σ) for which 0 < |θ − θ₀|/σ < δ.

¹² See Section A.9.




The probability density of a sample (x₁, …, x_n) drawn from a normal
distribution with mean θ and standard deviation σ is given by

(4:20)    p_n = \frac{1}{(2\pi)^{n/2}\sigma^{n}}\, e^{-\frac{1}{2\sigma^{2}} \sum_{\alpha=1}^{n} (x_\alpha - \theta)^{2}}

As in the general procedure described in the preceding section, the test
procedure will be based on the ratio p_1n/p_0n where p_0n is some weighted
average value of p_n corresponding to various points (θ, σ) in ω_a, and
p_1n is some weighted average of p_n corresponding to various points
(θ, σ) in ω_r. It is shown in Section A.9 that by choosing the weight
functions w_a(θ) and w_r(θ) according to the principles described in the
preceding section we are led to the following ratio:


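(4:21)    \frac{p_{1n}}{p_{0n}} = \frac{\dfrac{1}{2} \displaystyle\int_{0}^{\infty} \frac{1}{\sigma^{n}} \left[ e^{-\frac{1}{2\sigma^{2}} \sum_{\alpha=1}^{n} (x_\alpha - \theta_0 - \delta\sigma)^{2}} + e^{-\frac{1}{2\sigma^{2}} \sum_{\alpha=1}^{n} (x_\alpha - \theta_0 + \delta\sigma)^{2}} \right] d\sigma}{\displaystyle\int_{0}^{\infty} \frac{1}{\sigma^{n}}\, e^{-\frac{1}{2\sigma^{2}} \sum_{\alpha=1}^{n} (x_\alpha - \theta_0)^{2}}\, d\sigma}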



The test procedure is then carried out as follows. Additional observations
are taken as long as B < p_1n/p_0n < A. The hypothesis H is
rejected if p_1n/p_0n ≥ A and the hypothesis H is accepted if p_1n/p_0n
≤ B. To satisfy the requirements (4:18) and (4:19) for practical purposes
we may let A = (1 − β)/α and B = β/(1 − α).

4.2.4 A Particular Class of Problems Treated by Girshick 

A class of problems treated by M. A. Girshick may be formulated
as follows. Let x₁ and x₂ be two independent random variables. The
distribution (elementary probability law) of x₁ is given by f(x₁, θ₁)
and that of x₂ by f(x₂, θ₂), where the function f is known but the values
of the parameters θ₁ and θ₂ are unknown. The problem is to test the
hypothesis H that θ₁ ≤ θ₂ against the alternative hypothesis H′ that
θ₁ > θ₂.

The type of problem described above occurs frequently in applications.
For example, let $x$ denote some quality characteristic, such as hardness,
tensile strength, or weight, of a manufactured product. Suppose that the
distribution of $x$ in the population of units produced has a known
functional form $f(x, \theta)$, but the value of the parameter $\theta$ is
unknown. Suppose, furthermore, that there are two competing processes of
production under consideration by the manufacturer. Let $\theta_1$ denote
the value of $\theta$ when process 1 is used, and $\theta_2$ when process 2
is used. Both values, $\theta_1$ and $\theta_2$, are unknown. If the product
is considered the more desirable the greater the value of $\theta$, the
problem of deciding between the two competing processes reduces to that of
testing the hypothesis $H$ that $\theta_1 \leq \theta_2$. Process 1 is chosen
if $H$ is rejected, and process 2 is chosen if $H$ is accepted.

* Considerable work on the evaluation of the ratio (4:21) to bring it to a
suitable form for tabulation was done by K. Arnold while he was a member of
the Statistical Research Group of Columbia University. Tables for the
computation of this ratio have been prepared by the Mathematical Tables
Project, New York.

* M. A. Girshick, "Contributions to the Theory of Sequential Analysis," The
Annals of Mathematical Statistics, Vol. 17 (1946).

The following procedure for testing the hypothesis $H$ has been proposed
by Girshick. We choose a particular value $\theta_1^0$ of $\theta_1$ and a
particular value $\theta_2^0$ of $\theta_2$ where $\theta_1^0 < \theta_2^0$.
Let $H_0$ denote the hypothesis that the joint distribution of $x_1$ and
$x_2$ is given by $f(x_1, \theta_1^0) f(x_2, \theta_2^0)$, and let $H_1$ be
the alternative hypothesis that the joint distribution of $x_1$ and $x_2$ is
given by $f(x_1, \theta_2^0) f(x_2, \theta_1^0)$. We then set up the
sequential probability ratio test for testing the simple hypothesis $H_0$
against the simple alternative $H_1$. The hypothesis $H$ is accepted or
rejected according as the sequential probability ratio test leads to the
acceptance or rejection of $H_0$. Thus, to carry out the test procedure, two
constants $A$ and $B$ are chosen and the ratio

(4:22)    $\frac{p_{1m}}{p_{0m}} = \frac{f(x_{11}, \theta_2^0) f(x_{21}, \theta_1^0) \cdots f(x_{1m}, \theta_2^0) f(x_{2m}, \theta_1^0)}{f(x_{11}, \theta_1^0) f(x_{21}, \theta_2^0) \cdots f(x_{1m}, \theta_1^0) f(x_{2m}, \theta_2^0)}$

is computed at each stage of the experiment. Here $x_{i\alpha}$ denotes the
$\alpha$th observation on $x_i$ $(i = 1, 2)$. It is assumed that the
observations are taken in pairs, where each pair consists of an observation
on $x_1$ and an observation on $x_2$. Experimentation is continued as long as
the ratio $p_{1m}/p_{0m}$ lies between $B$ and $A$. The hypothesis $H$ is
accepted if $p_{1m}/p_{0m} \leq B$, and the hypothesis $H$ is rejected if
$p_{1m}/p_{0m} \geq A$.
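The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `girshick_sprt`, the caller-supplied `log_f`, the unit-variance normal density, and the data are all hypothetical and are not taken from the text. Logarithms are accumulated so that the ratio (4:22) is compared with $\log A$ and $\log B$.

```python
import math

def girshick_sprt(pairs, log_f, theta1_0, theta2_0, alpha, beta):
    """Run the paired test of H0 against H1 on (x1, x2) pairs.

    log_f(x, theta) must return log f(x, theta); it is supplied by the caller.
    Returns (decision, m); decision is "continue" if the pairs run out
    before either boundary is reached.
    """
    log_a = math.log((1 - beta) / alpha)   # log A
    log_b = math.log(beta / (1 - alpha))   # log B
    log_ratio = 0.0                        # log(p_1m / p_0m)
    m = 0
    for m, (x1, x2) in enumerate(pairs, start=1):
        # Numerator pairs x1 with theta2_0 and x2 with theta1_0;
        # the denominator pairs x1 with theta1_0 and x2 with theta2_0.
        log_ratio += (log_f(x1, theta2_0) + log_f(x2, theta1_0)
                      - log_f(x1, theta1_0) - log_f(x2, theta2_0))
        if log_ratio <= log_b:
            return "accept H", m
        if log_ratio >= log_a:
            return "reject H", m
    return "continue", m


def log_normal_pdf(x, theta):
    """Hypothetical choice of f: normal density, mean theta, unit variance."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * (x - theta) ** 2


# Made-up data: process 1 readings lie well above process 2 readings.
pairs = [(2.0, 0.0)] * 10
decision, m = girshick_sprt(pairs, log_normal_pdf, 0.0, 1.0, 0.05, 0.05)
print(decision, m)
```

With these made-up readings each pair adds 2 to the log ratio, so the upper boundary $\log 19$ is crossed at the second pair and $H$ is rejected.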

It has been shown by Girshick that in many important cases the above test
procedure will have the following property: There exists a function
$v = v(\theta_1, \theta_2)$ such that $v$ may be regarded as a reasonable
measure of the difference between $\theta_1$ and $\theta_2$, and the
probability of accepting $H$ depends only on the value of $v$. The function
$v$ satisfies, furthermore, the conditions: (1) $v(\theta_1, \theta_2) = 0$
when $\theta_1 = \theta_2$; (2) $v(\theta_1, \theta_2) < 0$ when
$\theta_2 > \theta_1$; (3) $v(\theta_1, \theta_2) = -v(\theta_2, \theta_1)$.

If a function $v$ with the above properties exists, the choice of the four
quantities $\theta_1^0$, $\theta_2^0$, $A$, and $B$ may be made on the basis
of the following considerations. Let $\delta$ be a positive value such that
the acceptance of $H$ is regarded as an error of practical importance
whenever $v \geq \delta$, and the rejection of $H$ is regarded as an error of
practical importance whenever $v \leq -\delta$; for values $v$ between
$-\delta$ and $\delta$ we do not care particularly which decision is made.
Thus, we shall want a test procedure for which the probability of rejecting
$H$ will not exceed a preassigned value $\alpha$ whenever $v \leq -\delta$
and the probability of accepting $H$ will not exceed a preassigned value
$\beta$ whenever $v \geq \delta$. The test procedure will have the desired
properties if the quantities $\theta_1^0$, $\theta_2^0$, $A$, and $B$ are
chosen so that $v(\theta_1^0, \theta_2^0) = -\delta$ and the sequential
probability ratio test for testing $H_0$ against $H_1$ has the strength
$(\alpha, \beta)$. For all practical purposes we may let
$A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$.

As an illustration, we shall consider the following example. Suppose
that one of two production processes is to be chosen. Suppose, further,
that the quality characteristic under consideration is normally distributed
with known mean and unknown standard deviation $\sigma_1$ when process 1 is
used, and that the distribution is normal with the same mean but unknown
standard deviation $\sigma_2$ when process 2 is used. The process that leads
to a smaller standard deviation is preferred. Thus, the manufacturer is
interested in testing the hypothesis $H$ that $\sigma_1 \leq \sigma_2$.
There is no loss of generality in assuming that the known means are equal
to 0. Let $H_0$ be the hypothesis that $\sigma_1 = \sigma_1^0$ and
$\sigma_2 = \sigma_2^0$, and $H_1$ the hypothesis that
$\sigma_1 = \sigma_2^0$ and $\sigma_2 = \sigma_1^0$
$(\sigma_1^0 < \sigma_2^0)$. Then the probability ratio for testing $H_0$
against $H_1$ is given by

(4:23)    $\frac{p_{1m}}{p_{0m}} = e^{-\frac{1}{2}\left[\frac{1}{(\sigma_2^0)^2} - \frac{1}{(\sigma_1^0)^2}\right]\sum_{\alpha=1}^{m}(x_{1\alpha}^2 - x_{2\alpha}^2)}$

where $x_{i\alpha}$ denotes the $\alpha$th observation from the population
corresponding to process $i$.

As Girshick has shown, the probability that the sequential probability
ratio test of $H_0$ against $H_1$ will terminate with the acceptance of
$H_0$ depends only on the value of

(4:24)    $v(\sigma_1, \sigma_2) = \frac{1}{2}\left(\frac{1}{\sigma_2^2} - \frac{1}{\sigma_1^2}\right)$

This quantity may be regarded as a reasonable measure of the deviation of
$\sigma_1$ from $\sigma_2$. Suppose we want a test procedure satisfying the
following conditions: The probability of rejecting $H$ should not exceed
$\alpha$ whenever
$\frac{1}{2}\left(\frac{1}{\sigma_2^2} - \frac{1}{\sigma_1^2}\right) \leq -\delta$,
and the probability of accepting $H$ should not exceed $\beta$ whenever
$\frac{1}{2}\left(\frac{1}{\sigma_2^2} - \frac{1}{\sigma_1^2}\right) \geq \delta$.
Then we choose $\sigma_1^0$ and $\sigma_2^0$ so that

(4:25)    $\frac{1}{2}\left[\frac{1}{(\sigma_2^0)^2} - \frac{1}{(\sigma_1^0)^2}\right] = -\delta$


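The probability ratio given in (4:23) then becomes equal to

(4:26)    $\frac{p_{1m}}{p_{0m}} = e^{\delta\sum_{\alpha=1}^{m}(x_{1\alpha}^2 - x_{2\alpha}^2)}$

When $\frac{1}{\delta}\log\frac{p_{1m}}{p_{0m}} = \sum_{\alpha=1}^{m}(x_{1\alpha}^2 - x_{2\alpha}^2)$
is used instead of $\frac{p_{1m}}{p_{0m}}$, the test procedure can be carried
out as follows. We continue taking pairs of observations as long as

(4:27)    $\frac{\log B}{\delta} < \sum_{\alpha=1}^{m}(x_{1\alpha}^2 - x_{2\alpha}^2) < \frac{\log A}{\delta}$

We accept $H$ if

(4:28)    $\sum_{\alpha=1}^{m}(x_{1\alpha}^2 - x_{2\alpha}^2) \leq \frac{\log B}{\delta}$

and reject $H$ if

(4:29)    $\sum_{\alpha=1}^{m}(x_{1\alpha}^2 - x_{2\alpha}^2) \geq \frac{\log A}{\delta}$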
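The rule in (4:27) to (4:29) needs only a running sum of differences of squares. A minimal sketch, assuming the known means have already been subtracted so that both are 0; the function name `compare_variances`, the value of $\delta$, and the readings are invented for illustration.

```python
import math

def compare_variances(pairs, delta, alpha, beta):
    """Sequential test of H: sigma_1 <= sigma_2 using the running sum of
    x1^2 - x2^2, compared with (log B)/delta and (log A)/delta.

    Returns (decision, m); decision is "continue" if no boundary is hit.
    """
    lower = math.log(beta / (1 - alpha)) / delta   # (log B) / delta
    upper = math.log((1 - beta) / alpha) / delta   # (log A) / delta
    total = 0.0
    m = 0
    for m, (x1, x2) in enumerate(pairs, start=1):
        total += x1 * x1 - x2 * x2
        if total <= lower:
            return "accept H", m
        if total >= upper:
            return "reject H", m
    return "continue", m


# Made-up readings in which process 2 scatters more widely than process 1.
pairs = [(0.5, 2.0), (-0.3, 1.6), (0.2, -2.5), (0.1, 1.0)]
decision, m = compare_variances(pairs, delta=0.5, alpha=0.05, beta=0.05)
print(decision, m)
```

Here the boundaries are about $\pm 5.89$; the sum is $-3.75$ after one pair and $-6.22$ after two, so $H$ is accepted at the second pair.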



PART II. APPLICATION OF THE GENERAL THEORY TO
SPECIAL CASES¹


Chapter 5. TESTING THE MEAN OF A BINOMIAL DISTRIBUTION
(ACCEPTANCE INSPECTION OF A LOT WHERE
EACH UNIT IS CLASSIFIED INTO ONE OF TWO CATEGORIES)

5.1 Formulation of the Problem

Let $x$ be a random variable which can take only the values 0 and 1.
Denote by $p$ the (unknown) probability that $x$ takes the value 1. We
shall deal here with the problem of testing the hypothesis that $p$ does
not exceed some specified value $p'$.

This problem arises, for example, in acceptance inspection of a lot
consisting of a large number of units of a manufactured product. Suppose
that each unit is classified in one of the two categories: defective and
non-defective. We shall assign the value 0 to any non-defective unit and
the value 1 to any defective unit. Let $p$ denote the unknown proportion of
defectives in the lot. Then the result $x$ of the inspection of a unit drawn
at random from the lot can take only the values 1 and 0 with probabilities
$p$ and $1 - p$, respectively. Usually it will be possible to specify some
value $p'$ such that we would like to accept the lot whenever $p \leq p'$
and we would like to reject the lot whenever $p > p'$. Thus, the problem of
deciding whether the lot is to be accepted or rejected on the basis of a
random sample may be formulated as the problem of testing the hypothesis
$p \leq p'$ against the alternative hypothesis that $p > p'$.

Since acceptance inspection of manufactured products is perhaps one of
the most important applications of testing the mean of a binomial
distribution, in what follows we shall use the terminology customary in
acceptance inspection. This, of course, does not mean that the test
procedure is not applicable to other cases as well. In the terminology of
acceptance inspection, our problem may be stated as follows: A proper
sampling plan (test procedure) is to be devised for deciding whether the
lot submitted for inspection should be accepted or rejected.

¹ The special cases treated here are discussed mainly to illustrate the
general theory and to bring out points of theoretical interest specific to
these applications. Accordingly, computational procedures and
simplifications are not stressed much and hardly any tables are given. A
more detailed and non-mathematical discussion of these applications,
together with a number of tables, charts, and computational
simplifications, is contained in "Sequential Analysis of Statistical Data:
Applications," a report prepared by the Statistical Research Group of
Columbia University and published by Columbia University Press, Sept.,
1945. This report will be referred to hereafter simply as SRG 255.

5.2 Tolerated Risks of Making Wrong Decisions 

Any sampling plan which does not provide for complete inspection of the
lot may lead to a wrong decision. That is, we may accept the lot when
$p > p'$, or we may reject the lot when $p \leq p'$. Since complete
inspection is frequently not feasible, or too costly, we are willing to
tolerate some risks of making wrong decisions. In order to devise a proper
sampling plan, it is necessary to state the maximum risks of wrong
decisions that we are willing to tolerate.

If $p = p'$, the quality of the lot is just on the margin and we are
indifferent which decision is made. For $p > p'$, we prefer to reject the
lot and this preference increases with increasing value of $p$. For
$p < p'$, we prefer to accept the lot and this preference increases with
decreasing value of $p$. If $p$ is only slightly above $p'$, the preference
for rejection is only slight and acceptance of the lot will not be regarded
as an error of practical consequence. Similarly, if $p$ is only slightly
below $p'$, rejection of the lot is not a serious error. Thus, it will be
possible to specify two values $p_0$ and $p_1$, $p_0$ below $p'$ and $p_1$
above $p'$, such that acceptance of the lot is regarded as an error of
practical consequence if (and only if) $p \geq p_1$, and rejection of the
lot is regarded as an error of practical importance if (and only if)
$p \leq p_0$. If $p$ lies between $p_0$ and $p_1$ we do not care
particularly which decision is made.

After the two values $p_0$ and $p_1$ have been chosen, the risks of making
wrong decisions which we are willing to tolerate may reasonably be
formulated as follows: The probability of rejecting the lot should not
exceed some small preassigned value $\alpha$ whenever $p \leq p_0$, and the
probability of accepting the lot should not exceed some small preassigned
value $\beta$ whenever $p \geq p_1$.

Thus, the tolerated risks are characterized by four numbers, $p_0$, $p_1$,
$\alpha$, and $\beta$. The choice of these four quantities is not a
statistical problem. They will be selected on the basis of practical
considerations in each particular case. A proper sampling plan can be
determined, as will be shown in the next section, after these four
quantities have been chosen.



5.3 The Sequential Probability Ratio Test Corresponding to the
Quantities $p_0$, $p_1$, $\alpha$, and $\beta$

5.3.1 Derivation of Algebraic Formulas for the Test Criterion

A sampling plan satisfying the conditions that the probability of
rejecting the lot does not exceed $\alpha$ whenever $p \leq p_0$, and the
probability of accepting the lot does not exceed $\beta$ whenever
$p \geq p_1$, is given by the sequential probability ratio test of strength
$(\alpha, \beta)$ for testing the hypothesis $p = p_0$ against the
hypothesis $p = p_1$. This test is defined as follows (see Section 3.1): Let
$x_i$ denote the result of the inspection of the $i$th unit; that is,
$x_i = 1$ if the $i$th unit inspected is found defective, and $x_i = 0$
otherwise. If $p$ denotes the proportion of defectives in the lot, the
probability of obtaining a sample equal to the observed
$(x_1, \cdots, x_m)$ is given by

(5:1)    $p^{d_m}(1 - p)^{m - d_m}$

where $d_m$ denotes the number of defectives in the first $m$ units
inspected.² Under the hypothesis that $p = p_1$ the probability (5:1)
becomes equal to

(5:2)    $p_{1m} = p_1^{d_m}(1 - p_1)^{m - d_m}$

and under the hypothesis that $p = p_0$ the probability (5:1) becomes
equal to

(5:3)    $p_{0m} = p_0^{d_m}(1 - p_0)^{m - d_m}$

The sequential probability ratio test is carried out as follows. At each
stage of the inspection, at the inspection of the $m$th unit for each
positive integral value $m$, we compute

(5:4)    $\log\frac{p_{1m}}{p_{0m}} = d_m\log\frac{p_1}{p_0} + (m - d_m)\log\frac{1 - p_1}{1 - p_0}$

Inspection is continued as long as

(5:5)    $\log\frac{\beta}{1 - \alpha} < \log\frac{p_{1m}}{p_{0m}} < \log\frac{1 - \beta}{\alpha}$

² The lot is assumed to be sufficiently large so that the successive
observations $x_1, x_2, \cdots$, etc., may be regarded as independent.



Inspection is terminated the first time that (5:5) does not hold. If
at this final stage we have

(5:6)    $\log\frac{p_{1m}}{p_{0m}} \geq \log\frac{1 - \beta}{\alpha}$

the lot is rejected, and if

(5:7)    $\log\frac{p_{1m}}{p_{0m}} \leq \log\frac{\beta}{1 - \alpha}$

the lot is accepted.³

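Inequalities (5:5), (5:6), and (5:7) can easily be seen to be equivalent
to the following inequalities:

(5:8)    $\frac{\log\frac{\beta}{1-\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} < d_m < \frac{\log\frac{1-\beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$

(5:9)    $d_m \geq \frac{\log\frac{1-\beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$

and

(5:10)    $d_m \leq \frac{\log\frac{\beta}{1-\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$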
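The equivalence can be checked in one step. The following is a sketch only; the symbols $D$ and $g$ are shorthand introduced here and are not the book's notation.

```latex
% Shorthand (assumed here): D = \log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0},
%                           g = \log\frac{1-p_0}{1-p_1}.
% Rearranging (5:4):
\log\frac{p_{1m}}{p_{0m}}
  = d_m \log\frac{p_1}{p_0} + (m - d_m)\log\frac{1-p_1}{1-p_0}
  = d_m D - m g .
% Since p_1 > p_0, both \log\frac{p_1}{p_0} > 0 and g > 0, so D > 0.
% Substituting into (5:5), adding m g, and dividing by D keeps the
% direction of the inequalities:
\frac{\log\frac{\beta}{1-\alpha} + m g}{D}
  < d_m <
\frac{\log\frac{1-\beta}{\alpha} + m g}{D} .
```

The same rearrangement applied to (5:6) and (5:7) gives the corresponding one-sided inequalities for $d_m$.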


³ There is a slight approximation involved in the use of the constants
$\log[\beta/(1 - \alpha)]$ and $\log[(1 - \beta)/\alpha]$. For further
details see Section 3.3.

For each value of $m$ we shall denote the right-hand member of (5:10) by
$a_m$ and call it acceptance number. Similarly, we shall denote the
right-hand member of (5:9) by $r_m$ and call it rejection number. For
purposes of practical computations, the use of the inequalities (5:8),
(5:9), and (5:10) seems to be much more convenient than the use of the
original inequalities (5:5), (5:6), and (5:7).⁴ On the basis of
inequalities (5:8), (5:9), and (5:10), the sequential probability ratio test
is carried out as follows. At each stage of the inspection we compute the
acceptance number $a_m$ and the rejection number $r_m$. Inspection is
continued as long as $a_m < d_m < r_m$. The first time that $d_m$ does not
lie between the acceptance and rejection numbers, inspection is terminated.
If $d_m \geq r_m$ the lot is rejected, and if $d_m \leq a_m$ the lot is
accepted.
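As a rough illustration of the procedure, the following Python sketch computes the acceptance and rejection numbers from (5:9) and (5:10) and stops at the first unit where $d_m$ leaves the interval. The function names and the plan values ($p_0 = .05$, $p_1 = .20$, $\alpha = .05$, $\beta = .10$) are hypothetical and chosen only for the example.

```python
import math

def plan_constants(p0, p1, alpha, beta):
    """Constant terms (h0, h1) and coefficient of m (s) in (5:10) and (5:9)."""
    denom = math.log(p1 / p0) - math.log((1 - p1) / (1 - p0))
    h0 = math.log(beta / (1 - alpha)) / denom
    h1 = math.log((1 - beta) / alpha) / denom
    s = math.log((1 - p0) / (1 - p1)) / denom
    return h0, h1, s

def inspect(units, p0, p1, alpha, beta):
    """units: iterable of 0/1 results (1 = defective). Returns (decision, m)."""
    h0, h1, s = plan_constants(p0, p1, alpha, beta)
    d = 0
    m = 0
    for m, x in enumerate(units, start=1):
        d += x
        if d >= h1 + s * m:      # d_m >= r_m
            return "reject", m
        if d <= h0 + s * m:      # d_m <= a_m
            return "accept", m
    return "continue", m

# Hypothetical plan and inspection records.
bad_run = inspect([1, 1, 1, 1], p0=0.05, p1=0.20, alpha=0.05, beta=0.10)
clean_run = inspect([0] * 20, p0=0.05, p1=0.20, alpha=0.05, beta=0.10)
print(bad_run, clean_run)
```

For this hypothetical plan three consecutive defectives lead to rejection at the third unit, while a run without defectives leads to acceptance at the fourteenth unit.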


5.3.2 Tabular Procedure for Carrying Out the Test

The acceptance number

(5:11)    $a_m = \frac{\log\frac{\beta}{1-\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$

and the rejection number

(5:12)    $r_m = \frac{\log\frac{1-\beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$

depend only on the quantities $p_0$, $p_1$, $\alpha$, and $\beta$. Thus,
they can be computed and tabulated before inspection starts. If $a_m$ is
not an integer, we may replace it by the largest integer $< a_m$.
Similarly, if $r_m$ is not an integer, we may replace it by the smallest
integer $> r_m$.

As an illustration, consider the following example. Let $p_0 = .1$,
$p_1 = .3$, $\alpha = .02$, and $\beta = .03$. The acceptance and rejection
numbers, as well as the results of the observations, in an experiment are
given in Table 5. In this example, inspection is terminated at $m = 22$
with the rejection of the lot.

⁴ The use of the inequalities (5:8), (5:9), and (5:10) instead of (5:5),
(5:6), and (5:7) was first suggested by J. H. Curtiss. In SRG 255 similar
transformations of the inequalities defining the test procedure have been
used in other problems as well.


                              TABLE 5

     m                              d_m
 Number of      Acceptance       Number of       Rejection
   Units          Number          Defects          Number
 Inspected                        Observed

     1                               0               ..
     2                               0               ..
     3                               1               ..
     4                               1                4
     5                               1                4
     6                               1                4
     7                               1                5
     8                               1                5
     9                               2                5
    10                               2                6
    11                               3                6
    12                               4                6
    13                               4                6
    14              0                5                6
    15              0                5                6
    16              0                5                6
    17              0                5                7
    18              0                6                7
    19              0                6                7
    20              1                6                7
    21              1                6                7
    22              1                7                7
    23              1                                 8
    24              1                                 8
    25              2                                 8
    26              2                                 8
    27              2                                 8
    28              2                                 9
    29              2                                 9
    30              3                                 9
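The course of the experiment in Table 5 can be replayed with a short Python sketch. It takes $p_1 = .3$, $\alpha = .02$, and $\beta = .03$ from the example; the value $p_0 = .1$ is an assumption made here, chosen because it is consistent with the tabulated acceptance numbers. The list `d` holds the cumulative defect counts read off the table.

```python
import math

# p0 = 0.1 is an assumption (see the lead-in); the other values are the example's.
p0, p1, alpha, beta = 0.1, 0.3, 0.02, 0.03

denom = math.log(p1 / p0) - math.log((1 - p1) / (1 - p0))
h0 = math.log(beta / (1 - alpha)) / denom      # constant term of a_m
h1 = math.log((1 - beta) / alpha) / denom      # constant term of r_m
s = math.log((1 - p0) / (1 - p1)) / denom      # coefficient of m

# Cumulative number of defects d_m for m = 1, ..., 22, as listed in Table 5.
d = [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7]

outcome = None
for m, d_m in enumerate(d, start=1):
    a_m = h0 + s * m     # acceptance number (5:11), before rounding
    r_m = h1 + s * m     # rejection number (5:12), before rounding
    if d_m >= r_m:
        outcome = ("reject", m)
        break
    if d_m <= a_m:
        outcome = ("accept", m)
        break

print(outcome)   # ('reject', 22)
```

Under these assumptions the unrounded rejection number at $m = 22$ is about 6.97, so the seventh defect ends inspection there, in agreement with the text.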


5.3.3 Graphical Procedure for Carrying Out the Test

The test procedure can also be carried out graphically. The number $m$ of
observations is measured along the horizontal axis and the number $d_m$ of
defects along the vertical axis. The points $(m, a_m)$ lie on a straight
line $L_0$, since $a_m$ is a linear function of $m$. Similarly the points
$(m, r_m)$ lie on a straight line $L_1$. The intercept of $L_0$ is given by

(5:13)    $h_0 = \frac{\log\frac{\beta}{1-\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$

and the intercept of $L_1$ is given by

(5:14)    $h_1 = \frac{\log\frac{1-\beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$

The lines $L_0$ and $L_1$ are parallel and the common slope is equal to

(5:15)    $s = \frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$

The two straight lines $L_0$ and $L_1$ are drawn before inspection starts.
The points $(m, d_m)$ are plotted as inspection goes on. We continue
inspecting additional units as long as the point $(m, d_m)$ lies between
the lines $L_0$ and $L_1$. Inspection is terminated the first time that the
point $(m, d_m)$ does not lie between the lines $L_0$ and $L_1$. If
$(m, d_m)$ lies on $L_0$ or below, the lot is accepted. If $(m, d_m)$ lies
on $L_1$ or above, the lot is rejected.

Figure 11 shows the graphical procedure for the example given in
Section 5.3.2.

Fig. 11


5.4 The Operating Characteristic (OC) Function $L(p)$ of the Test⁵

5.4.1 Determination of $L(p)$ for Some Special Values of $p$

As defined in Section 2.2.1, the value of the OC function $L(p)$ for each
$p$ is equal to the probability that the lot will be accepted when $p$ is
the true proportion of defectives in the lot. One can easily verify that

(5:16)    $L(0) = 1$ and $L(1) = 0$

Since the test procedure is so set up that the probability is $1 - \alpha$
that the lot will be accepted when $p = p_0$, and the probability is
$\beta$ that the lot will be accepted when $p = p_1$, we have

(5:17)    $L(p_0) = 1 - \alpha$ and $L(p_1) = \beta$

When

$p = s = \frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}$

we obtain from equation (3:43)

(5:18)    $L(s) = \frac{\log\frac{1-\beta}{\alpha}}{\log\frac{1-\beta}{\alpha} - \log\frac{\beta}{1-\alpha}} = \frac{h_1}{h_1 + |h_0|}$

where $h_0$ and $h_1$ are the intercepts of the lines $L_0$ and $L_1$.⁶

Thus, five points on the OC curve corresponding to $p = 0$, $1$, $p_0$,
$p_1$, and $s$ can immediately be determined. Since $L(p)$ is monotonically
decreasing with increasing $p$, the five points will determine fairly
closely the shape of the whole OC curve. This will frequently be sufficient
for practical purposes and there will be no need to compute $L(p)$ for
additional values of $p$.
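The five points can be tabulated directly. A minimal Python sketch follows; the function name `five_oc_points` is invented here, the value at $p = s$ is taken as $\log A/(\log A - \log B)$ with $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$, and the plan values are illustrative.

```python
import math

def five_oc_points(p0, p1, alpha, beta):
    """Return the points (p, L(p)) for p = 0, p0, s, p1, 1 in increasing p."""
    denom = math.log(p1 / p0) - math.log((1 - p1) / (1 - p0))
    s = math.log((1 - p0) / (1 - p1)) / denom        # common slope
    log_a = math.log((1 - beta) / alpha)
    log_b = math.log(beta / (1 - alpha))
    l_s = log_a / (log_a - log_b)                    # value of L at p = s
    return [(0.0, 1.0), (p0, 1 - alpha), (s, l_s), (p1, beta), (1.0, 0.0)]

# Illustrative plan values.
points = five_oc_points(0.1, 0.3, 0.02, 0.03)
for p, big_l in points:
    print(f"p = {p:.4f}   L(p) = {big_l:.4f}")
```

For these values the slope is about 0.186 and the OC function there is about 0.527, roughly midway between $1 - \alpha$ and $\beta$.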


⁵ The formulas given in this section involve an approximation caused by
neglecting the excess of the cumulative sum over the boundaries at the
termination of the test procedure. See Sections 3.4 and A.2.3. An exact
formula for $L(p)$ is given in Section 5.4.3 for the case when the slope of
the decision lines is equal to the reciprocal of an integer.

⁶ When $p = s$, the value of $h$ in formula (3:43) is equal to 0. The
limiting value of the right-hand member of (3:43), when $h \to 0$, is equal
to $\log A/(\log A - \log B)$, which is equal to the right-hand member of
(5:18), since $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$.



5.4.2 Determination of $L(p)$ over the Whole Range of $p$

It has been shown in Chapter 3, equations (3:45) and (3:46), that⁷

(5:19)    $L(p) = \frac{\left(\frac{1-\beta}{\alpha}\right)^h - 1}{\left(\frac{1-\beta}{\alpha}\right)^h - \left(\frac{\beta}{1-\alpha}\right)^h}$

where $h$ is determined by the equation

(5:20)    $p = \frac{1 - \left(\frac{1-p_1}{1-p_0}\right)^h}{\left(\frac{p_1}{p_0}\right)^h - \left(\frac{1-p_1}{1-p_0}\right)^h}$

To compute the OC curve, it is not necessary to solve equation (5:20) in
$h$. For any arbitrarily chosen value $h$, the values of $p$ and $L(p)$ may
be computed from (5:19) and (5:20). The point $[p, L(p)]$ computed in this
way will be a point on the OC curve. The OC curve can be drawn by plotting a
sufficiently large number of points $[p, L(p)]$ corresponding to various
values of $h$. Figure 12 shows a typical OC curve.
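The parametric computation can be sketched as follows. The sketch assumes the usual forms $L = (A^h - 1)/(A^h - B^h)$ and $p = (1 - c^h)/(r^h - c^h)$, with $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$, $r = p_1/p_0$, and $c = (1-p_1)/(1-p_0)$; the function name `oc_point`, the grid of $h$ values, and the plan values are illustrative choices.

```python
def oc_point(h, p0, p1, alpha, beta):
    """Return (p, L(p)) for a chosen parameter value h (h must not be 0)."""
    a_h = ((1 - beta) / alpha) ** h
    b_h = (beta / (1 - alpha)) ** h
    r_h = (p1 / p0) ** h
    c_h = ((1 - p1) / (1 - p0)) ** h
    p = (1 - c_h) / (r_h - c_h)
    big_l = (a_h - 1) / (a_h - b_h)
    return p, big_l

# Choose h freely, compute the matching p and L(p), then order the points by p.
hs = [-3, -2, -1, -0.5, 0.5, 1, 2, 3]
curve = sorted(oc_point(h, 0.1, 0.3, 0.02, 0.03) for h in hs)
for p, big_l in curve:
    print(f"p = {p:.4f}   L(p) = {big_l:.4f}")
```

No equation has to be solved for $h$: each chosen $h$ yields one point, and $h = 1$ and $h = -1$ reproduce the points at $p_0$ and $p_1$.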





The range of $h$ in (5:19) and (5:20) is from $-\infty$ to $+\infty$. It
can be verified that the right-hand member of (5:19) is increasing with
increasing $h$, and the right-hand member of (5:20) is decreasing with
increasing $h$. The five values of $p$ considered in Section 5.4.1, that is,
$p = 0$, $p_0$, $s$, $p_1$, $1$, correspond to the values of
$h = +\infty$, $1$, $0$, $-1$, $-\infty$, respectively, as can be seen from
(5:20). Letting $h = +\infty$, $1$, $0$, $-1$, $-\infty$ in (5:19), we
obtain the corresponding five values of $L(p)$ which coincide with those
given in Section 5.4.1.

If the part of the OC curve corresponding to positive values of $h$ has
been determined, the computation of the part of the OC curve corresponding
to negative values of $h$ can be simplified.⁸ To show this, let $h$ be a
given positive value and let $[p, L(p)]$ be the corresponding point on the
OC curve. Let $[p', L(p')]$ denote the point on the OC curve corresponding
to $-h$. Then we have

(5:21)    $L(p') = \frac{\left(\frac{1-\beta}{\alpha}\right)^{-h} - 1}{\left(\frac{1-\beta}{\alpha}\right)^{-h} - \left(\frac{\beta}{1-\alpha}\right)^{-h}} = \left(\frac{\beta}{1-\alpha}\right)^{h} L(p)$

Similarly,

(5:22)    $p' = \frac{1 - \left(\frac{1-p_1}{1-p_0}\right)^{-h}}{\left(\frac{p_1}{p_0}\right)^{-h} - \left(\frac{1-p_1}{1-p_0}\right)^{-h}} = \left(\frac{p_1}{p_0}\right)^{h} p$

Thus, the point $[p', L(p')]$ corresponding to $-h$ can be computed from
the point $[p, L(p)]$ corresponding to $h$ by using the simple relations

$p' = \left(\frac{p_1}{p_0}\right)^{h} p$ and $L(p') = \left(\frac{\beta}{1-\alpha}\right)^{h} L(p)$

⁷ In the formulas given in SRG 255, p. 2.50, the quantities $p$ and $L(p)$
are expressed in terms of another parameter $x$ which is functionally
related to $h$.

⁸ A similar simplification is given in SRG 255, p. 2.50, with reference to
the parameter $x$ used there.

5.4.3 Exact Formula for $L(p)$ When the Reciprocal of the Slope of
the Decision Lines Is an Integer

The quantity $z$, i.e., the logarithm of the probability ratio for a
single observation, can take only the values $\log(p_1/p_0)$ and
$\log[(1 - p_1)/(1 - p_0)]$. It follows from (5:15) that

$\log\frac{p_1}{p_0} = \left(\frac{1}{s} - 1\right)\log\frac{1 - p_0}{1 - p_1}$

where $s$ is the slope of the decision lines. Assume that $1/s$ is an
integer. Then the two values of $z$ are integral multiples of
$d = \log[(1 - p_0)/(1 - p_1)]$, namely, $-d$ and $[(1/s) - 1]d$, and the
results in the last part of Section A.4 can be used to determine the exact
OC curve.⁹ On the basis of these results one can show that


L(P) 



where $A$ and $B$ are the constants used in the sequential test,¹⁰ the
symbol $[k]$ denotes the smallest integer $\geq k$, and
$u_1, u_2, \cdots, u_l$ are the roots of the equation

$(1 - p)u + \frac{p}{u^{(1/s) - 1}} = 1$


A different method for deriving an exact formula for $L(p)$ was given by
M. A. Girshick in The Annals of Mathematical Statistics, Vol. 17 (1946).
His method does not require the computation of the roots
$u_1, \cdots, u_l$.

⁹ To reduce this case to the case discussed in the last part of Section
A.4, one merely has to consider the test corresponding to $z^*$, $A^*$, and
$B^*$ where $z^* = -z$, $\log A^* = -\log B$ and $\log B^* = -\log A$.

¹⁰ To obtain a test of strength $(\alpha, \beta)$, we used the approximate
values $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$.

5.5 The Average Sample Number (ASN) Function of the Test

Let $n$ denote the number of observations required by the test procedure.
Then $n$ is a random variable, since it depends on the outcome of the
observations. The expected value of $n$ depends on the proportion $p$ of
defectives in the lot and is denoted by $E_p(n)$. This can be plotted as a
curve, $p$ being measured along the horizontal axis and $E_p(n)$ along the
vertical axis. This curve is called the ASN curve of the test (see Section
2.2.2 for a general definition of the ASN curve). A typical ASN curve is
shown in Fig. 13.

The general formula for the ASN function of a sequential probability
ratio test is derived in Section 3.5. The approximation formula (3:57)
applied to the binomial case gives¹¹

(5:23)    $E_p(n) = \frac{L(p)\log B + [1 - L(p)]\log A}{p\log\frac{p_1}{p_0} + (1 - p)\log\frac{1 - p_1}{1 - p_0}}$

where $A = (1 - \beta)/\alpha$, $B = \beta/(1 - \alpha)$, and $L(p)$
denotes the probability that inspection terminates with the acceptance of
the lot. Using this formula, we shall compute $E_p(n)$ for $p = 0$, $p_0$,
$p_1$, and $1$. Since $L(0) = 1$, the value of $E_p(n)$ is given by

(5:24)    $E_p(n) = \frac{\log\frac{\beta}{1 - \alpha}}{\log\frac{1 - p_1}{1 - p_0}}$

when $p = 0$. For $p = p_0$, we have $L(p) = 1 - \alpha$ and we obtain from
(5:23)

(5:25)    $E_{p_0}(n) = \frac{(1 - \alpha)\log\frac{\beta}{1 - \alpha} + \alpha\log\frac{1 - \beta}{\alpha}}{p_0\log\frac{p_1}{p_0} + (1 - p_0)\log\frac{1 - p_1}{1 - p_0}}$

For $p = p_1$, we have $L(p) = \beta$ and we obtain from (5:23)

(5:26)    $E_{p_1}(n) = \frac{\beta\log\frac{\beta}{1 - \alpha} + (1 - \beta)\log\frac{1 - \beta}{\alpha}}{p_1\log\frac{p_1}{p_0} + (1 - p_1)\log\frac{1 - p_1}{1 - p_0}}$

Since $L(1) = 0$, we obtain from (5:23)

(5:27)    $E_p(n) = \frac{\log\frac{1 - \beta}{\alpha}}{\log\frac{p_1}{p_0}}$

when $p = 1$.

¹¹ The right-hand member of (5:23) can be expressed as a function of
$L(p)$, the intercepts, and the slope of the decision lines. See SRG 255,
p. 2.63.

Using formula (A:99) in the Appendix, we can compute the value of
$E_p(n)$ when $p$ is equal to the common slope $s$ of the acceptance and
rejection lines, i.e., when¹²

$p = \frac{\log\frac{1 - p_0}{1 - p_1}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}} = s$

From (A:99) we obtain

(5:28)    $E_s(n) = \frac{-\log\frac{\beta}{1 - \alpha}\,\log\frac{1 - \beta}{\alpha}}{E_s(z^2)}$

where $E_s(z^2)$ is the expected value of $z^2$ and $z$ is a random
variable which can take only the values $\log(p_1/p_0)$ and
$\log[(1 - p_1)/(1 - p_0)]$ with probabilities $s$ and $1 - s$,
respectively. Thus

(5:29)    $E_s(z^2) = s\left(\log\frac{p_1}{p_0}\right)^2 + (1 - s)\left(\log\frac{1 - p_1}{1 - p_0}\right)^2 = \frac{\log\frac{1 - p_0}{1 - p_1}\left(\log\frac{p_1}{p_0}\right)^2 + \log\frac{p_1}{p_0}\left(\log\frac{1 - p_0}{1 - p_1}\right)^2}{\log\frac{p_1}{p_0} + \log\frac{1 - p_0}{1 - p_1}} = \log\frac{p_1}{p_0}\,\log\frac{1 - p_0}{1 - p_1}$

From (5:28) and (5:29) we obtain

(5:30)    $E_s(n) = \frac{-\log\frac{\beta}{1 - \alpha}\,\log\frac{1 - \beta}{\alpha}}{\log\frac{p_1}{p_0}\,\log\frac{1 - p_0}{1 - p_1}}$

¹² The value $s$ of $p$ corresponds to the value $\theta'$ in formula
(A:99). It can be shown that $s$ lies between $p_0$ and $p_1$. Formula
(A:99), and therefore also (5:28), involves an approximation caused by
neglecting the excess of the cumulative sum over the boundaries.

The determination of the five points of the ASN curve, as given in
(5:24), (5:25), (5:26), (5:27), and (5:30), may frequently suffice in
practice, since these five points already give a fairly good idea of the
shape of the whole curve. The ASN curve generally increases as $p$
increases from 0 to $p_0$, and decreases as $p$ increases from $p_1$ to 1.
In the interval $(p_0, p_1)$ the ASN curve generally increases as $p$
increases from $p_0$ to some value $p'$ and decreases as $p$ increases from
$p'$ to $p_1$. The value $p'$ is generally equal to $s$ or is very near $s$.

If it is desired to plot the ASN curve over the whole range of $p$, it is
necessary first to compute the OC function $L(p)$. The value of $E_p(n)$
can then easily be determined from (5:23) for any value $p$.
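The five ASN values can be evaluated numerically as a check on the shape described above. A sketch under the same approximations as (5:23) and (5:30); the function names `asn` and `asn_at_slope` and the plan values are illustrative choices, not the book's.

```python
import math

def asn(p, big_l, p0, p1, alpha, beta):
    """Approximate E_p(n) from (5:23), given L(p) = big_l.

    Not usable at p = s, where the denominator vanishes.
    """
    log_a = math.log((1 - beta) / alpha)
    log_b = math.log(beta / (1 - alpha))
    mean_z = p * math.log(p1 / p0) + (1 - p) * math.log((1 - p1) / (1 - p0))
    return (big_l * log_b + (1 - big_l) * log_a) / mean_z

def asn_at_slope(p0, p1, alpha, beta):
    """Approximate E_s(n) at p = s, from (5:30)."""
    log_a = math.log((1 - beta) / alpha)
    log_b = math.log(beta / (1 - alpha))
    return -log_b * log_a / (math.log(p1 / p0) * math.log((1 - p0) / (1 - p1)))

p0, p1, alpha, beta = 0.1, 0.3, 0.02, 0.03   # illustrative plan values
values = {
    "p=0": asn(0.0, 1.0, p0, p1, alpha, beta),
    "p=p0": asn(p0, 1 - alpha, p0, p1, alpha, beta),
    "p=s": asn_at_slope(p0, p1, alpha, beta),
    "p=p1": asn(p1, beta, p0, p1, alpha, beta),
    "p=1": asn(1.0, 0.0, p0, p1, alpha, beta),
}
for label, value in values.items():
    print(f"{label:5s} E(n) = {value:.2f}")
```

For these values the expected sample size is largest at $p = s$ (about 49) and smallest at $p = 1$ (about 3.5), in line with the rise and fall described in the text.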

5.6 Observations Taken in Groups

5.6.1 General Discussion

For practical reasons it may sometimes be preferable to take the
observations in groups, rather than singly. That is, the test procedure is
carried out as follows. A group $g_1$ consisting of $v$ units is drawn from
the lot. If the number of defectives $d_v$ in this group $g_1$ is less than
or equal to the acceptance number $a_v$, inspection terminates with the
acceptance of the lot. If $d_v$ is greater than or equal to the rejection
number $r_v$, inspection terminates with the rejection of the lot. If
$a_v < d_v < r_v$, a second group of $v$ units is drawn. Again, the lot is
accepted if the total number of defectives $d_{2v}$ in the two groups is
less than or equal to $a_{2v}$, the lot is rejected if
$d_{2v} \geq r_{2v}$, and a third group $g_3$ of $v$ units is drawn if
$a_{2v} < d_{2v} < r_{2v}$. This process is continued until either
rejection or acceptance of the lot is decided. Thus, when the observations
are taken in groups of $v$ units, the number $d_m$ of defectives found is
compared with the corresponding acceptance number $a_m$ and rejection
number $r_m$ only for $m = v, 2v, 3v, \cdots$, etc.

The purpose of this section is to make some comments on the effect of
grouping on the OC and ASN curves of the sampling plan. Clearly, grouping
can only increase the number of observations required by the test. For,
suppose that inspection terminates at the $n$th unit when observations are
taken singly. If $n$ is equal to an integral multiple of $v$, i.e.,
$n = kv$, then the number of groups inspected, when observations are taken
in groups, will be precisely equal to $k$, and the total number of units
inspected will be the same as when observations are taken singly. However,
if $kv < n < (k + 1)v$, grouping will cause an increase in the amount of
inspection, since we shall have to inspect at least $(k + 1)$ groups, that
is, at least $(k + 1)v$ units. It may even happen that we shall have to
inspect more than $(k + 1)$ groups. This will be the case when $d_n$ lies
outside the interval $(a_n, r_n)$, but
$a_{(k+1)v} < d_{(k+1)v} < r_{(k+1)v}$. Thus, the increase in the expected
number of units inspected caused by grouping may even exceed $v$ in some
cases.
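The effect on the amount of inspection can be seen on a single inspection record. In the sketch below the plan constants (`h0`, `h1`, `s`), the group size, and the record are all made up for illustration; the only change between the two runs is that the comparison with $a_m$ and $r_m$ is made at every unit or only at multiples of $v$.

```python
def units_inspected(defects, h0, h1, s, v=1):
    """Decision and number of units inspected when d_m is compared with
    a_m = h0 + s*m and r_m = h1 + s*m only at m = v, 2v, 3v, ...

    defects: list of 0/1 results in inspection order (1 = defective).
    Returns ("continue", len(defects)) if no decision is reached.
    """
    d = 0
    for m, x in enumerate(defects, start=1):
        d += x
        if m % v:
            continue                 # inside a group: no comparison yet
        if d >= h1 + s * m:
            return "reject", m
        if d <= h0 + s * m:
            return "accept", m
    return "continue", len(defects)

# Hypothetical plan and made-up inspection record.
h0, h1, s = -1.5, 2.0, 0.1
record = [0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
single = units_inspected(record, h0, h1, s, v=1)
grouped = units_inspected(record, h0, h1, s, v=4)
print(single, grouped)
```

Item-by-item inspection stops at $n = 5$; with groups of four, $4 < 5 < 8$, and the decision is reached only at the end of the second group, after eight units.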

Regarding the effect of grouping on the OC curve, the following remarks
may be made. Putting $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$,
the probability $\alpha'$ of rejecting the lot when $p = p_0$ and the
probability $\beta'$ of accepting the lot when $p = p_1$ will be only
approximately equal to $\alpha$ and $\beta$, respectively, even if the
observations are taken singly. This was pointed out in Section 3.3, where
the following inequalities were derived:

(5:31)    $\alpha' \leq \frac{\alpha}{1 - \beta}$ and $\beta' \leq \frac{\beta}{1 - \alpha}$

It can easily be verified that these inequalities also remain valid when
the observations are taken in groups. The quantities $\alpha$ and $\beta$
are usually very small and $\alpha/(1 - \beta)$ and $\beta/(1 - \alpha)$
are very nearly equal to $\alpha$ and $\beta$, respectively. Thus, also in
case of grouping, the realized values $\alpha'$ and $\beta'$ cannot exceed
the intended values $\alpha$ and $\beta$, respectively, except by an
exceedingly small quantity which can be neglected for all practical
purposes. This means that, for all practical purposes, grouping will not
decrease the protection against wrong decisions provided by the test. The
only possible effect of practical significance that may be caused by
grouping is that it may make $\alpha'$ or $\beta'$ substantially smaller
than the intended values $\alpha$ and $\beta$. This feature of grouping
compensates, to some extent, for the increase in the number of
observations.

It may be of interest to remark that, if the number $v$ of units in a
group is equal to the reciprocal of the common slope $s$ of the acceptance
and rejection lines and if the intercepts of these lines are integers, the
OC curve is not affected at all by grouping.¹³ This can be seen as follows:
Because $s = 1/v$, we have $a_{m+v} = a_m + 1$ and $r_{m+v} = r_m + 1$.
Furthermore, since the intercepts of the acceptance and rejection lines are
assumed to be integers, $a_m$ and $r_m$ have integral values for any $m$
which is an integral multiple of $v$. If item-by-item inspection leads to
acceptance of the lot at the $n$th item, then $n$ must be an integral
multiple of $v$, and therefore inspection in groups of $v$ will also lead
to acceptance. If item-by-item inspection leads to rejection of the lot at
the $n$th item, then we have $d_n \geq r_n$. Let $n'$ be the smallest
integral multiple of $v$ greater than or equal to $n$. Then $r_{n'}$ is an
integer and $r_{n'} - r_n < 1$; since $d_n$ is an integer with
$d_n \geq r_n$, it follows that $d_n \geq r_{n'}$. Hence
$d_{n'} \geq d_n \geq r_{n'}$, and inspection in groups will also terminate
with rejection of the lot. Thus, inspection in groups leads to exactly the
same decision as item-by-item inspection and consequently grouping does not
affect the OC curve.
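The argument can be checked empirically on simulated inspection records. The sketch below assumes a hypothetical plan with integer intercepts $-2$ and $3$ and slope $1/5$, so that $v = 5$; the comparisons are multiplied through by $v$ so that only integers are compared. The seed, the number of records, and the range of $p$ are arbitrary choices.

```python
import random

def decide(defects, h0, h1, v, grouped):
    """Decision under a_m = h0 + m/v and r_m = h1 + m/v, h0 and h1 integers.

    With grouped=True, d_m is compared only when m is a multiple of v.
    Returns "undecided" if the record ends before a decision is reached.
    """
    d = 0
    for m, x in enumerate(defects, start=1):
        d += x
        if grouped and m % v:
            continue
        if v * d >= v * h1 + m:      # d_m >= r_m
            return "reject"
        if v * d <= v * h0 + m:      # d_m <= a_m
            return "accept"
    return "undecided"

rng = random.Random(0)
h0, h1, v = -2, 3, 5                 # hypothetical plan with slope s = 1/5
mismatches = 0
for _ in range(2000):
    p = rng.uniform(0.05, 0.5)
    # Record length is a multiple of v, so the last group is complete.
    record = [1 if rng.random() < p else 0 for _ in range(500)]
    if decide(record, h0, h1, v, False) != decide(record, h0, h1, v, True):
        mismatches += 1

print(mismatches)   # 0
```

Every simulated record gives the same decision under both modes of inspection, as the reasoning above requires; only the number of units inspected can differ.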


5.6.2 Upper and Lower Limits for the Effect of Grouping on the
OC and ASN Curves

Upper and lower limits for the effect of grouping on the OC and ASN
curves can be obtained by considering the following three auxiliary
sequential sampling plans. Let h_0 be the intercept of the acceptance
line, h_1 the intercept of the rejection line, and s the common slope in
the given sampling plan. The first auxiliary plan is obtained by
changing h_0 to h_0* = h_0 − vs and leaving h_1 and s unchanged. The
second auxiliary plan is obtained by changing h_1 to h_1* = h_1 + vs,
leaving h_0 and s unchanged. Finally, the third auxiliary plan corresponds
to the intercepts h_0*, h_1*, and slope s. Let L_i(p) denote the
OC function and E_p^(i)(n) the ASN function of the auxiliary plan i, when
item-by-item inspection is used (i = 1, 2, 3). Furthermore, let L(p)
denote the OC function and E_p(n) the ASN function of the given plan
when item-by-item inspection is used. When inspection is made in
groups the OC and ASN functions are affected,† and we shall denote
them by L̄(p) and Ē_p(n), respectively.

* See also SRG 255, p. 2.30.

† Except, in the case of the OC function, when the number of units in
the group is equal to the reciprocal of the slope, as stated in
Section 5.6.1.

It can easily be seen that whenever the first auxiliary plan (using
item-by-item inspection) leads to the acceptance of the lot, the original
plan (taking observations in groups) also leads to acceptance.
The converse is, however, not necessarily true. That is, it may happen
that the auxiliary plan leads to rejection of the lot, whereas the original
plan leads to acceptance. Thus, writing L̄(p) for the OC function of the
original plan when observations are taken in groups, we have

(5:32)    $$L_1(p) \le \bar{L}(p)$$

Similarly, one can verify that whenever the second auxiliary plan
(using item-by-item inspection) leads to rejection of the lot, the original
plan (using grouping) also leads to rejection. Hence

(5:33)    $$1 - L_2(p) \le 1 - \bar{L}(p)$$

This can be written as

(5:34)    $$\bar{L}(p) \le L_2(p)$$

From (5:32) and (5:34) we obtain

(5:35)    $$L_1(p) \le \bar{L}(p) \le L_2(p)$$
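
The two statements behind (5:32) and (5:33) hold sequence by sequence, so they can be illustrated by simulation. The sketch below is only an illustration under assumed values: the plan (h_0, h_1, s, v) and the helper `run_plan` are hypothetical, and exact fractions are used so that comparisons on the decision lines are not affected by rounding.

```python
import random
from fractions import Fraction

def run_plan(items, h0, h1, s, group=1):
    """Return "accept", "reject", or None for a 0/1 sequence (1 = defective).

    Lines: a_m = h0 + s*m and r_m = h1 + s*m; decisions are considered
    only after every `group` items.
    """
    d = 0
    for m, x in enumerate(items, start=1):
        d += x
        if m % group != 0:
            continue
        if d <= h0 + s * m:
            return "accept"
        if d >= h1 + s * m:
            return "reject"
    return None

# Hypothetical plan and group size.
h0, h1, s, v = Fraction(-3, 2), Fraction(5, 2), Fraction(1, 10), 4
rng = random.Random(1)
violations = 0
for _ in range(300):
    p = rng.uniform(0.02, 0.3)
    items = [1 if rng.random() < p else 0 for _ in range(100 * v)]
    grouped = run_plan(items, h0, h1, s, group=v)   # original plan, grouped
    first = run_plan(items, h0 - v * s, h1, s)      # first auxiliary plan
    second = run_plan(items, h0, h1 + v * s, s)     # second auxiliary plan
    if first == "accept" and grouped != "accept":
        violations += 1
    if second == "reject" and grouped != "reject":
        violations += 1
print("violations of the two implications:", violations)
```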

To derive an upper limit for Ē_p(n), the ASN function of the given plan
when observations are taken in groups, we shall make use of the third
auxiliary plan. If this plan (using item-by-item inspection) terminates
at the inspection of the nth unit, the original plan (using grouping)
must terminate at the latest with the inspection of the group in which
the nth item is included.* Hence, the number n' of units inspected
when the original plan is used cannot exceed n + v. From this it
follows that

(5:36)    $$\bar{E}_p(n) \le E_p^{(3)}(n) + v$$

Since E_p(n) ≤ Ē_p(n), we obtain the limits

(5:37)    $$E_p(n) \le \bar{E}_p(n) \le E_p^{(3)}(n) + v$$

Limits for L̄(p) and Ē_p(n) could also be derived by using the method
described in Sections A.2.3 and A.3.1 of the Appendix. The limits
given in (5:35) and (5:37) will be rather close when p_1/p_0 and
(1 − p_1)/(1 − p_0) are near 1 and vs does not exceed 1.

* It is possible, of course, that inspection terminates with an earlier
group.

5.7 Truncation of the Test Procedure

The sequential sampling plan does not provide any definite upper
bound for the number n of units to be inspected. Any large value of
n is possible, but the probability is small that n will exceed twice or
three times its expected value. It is sometimes desirable to set a
definite upper bound n_0 for n, excluding even a small probability that
n may exceed n_0. This can be done by truncating the sequential process
at n = n_0. That is to say, we terminate the process at n = n_0
even if the regular sequential rule does not lead to a final decision for
n ≤ n_0. The following seems to be a reasonable rule for deciding
acceptance or rejection of the lot at n = n_0 if no decision is reached
for n ≤ n_0 with the regular sequential procedure: If d_{n_0} ≥
(a_{n_0} + r_{n_0})/2 we reject the lot, and if d_{n_0} < (a_{n_0} + r_{n_0})/2
we accept the lot.

Truncation and its effect on the OC curve are discussed in Section
3.8. If n_0 is put as high as three times the expected value of n, the
effect of truncation on the OC curve is negligibly small, since the
probability is nearly 1 that the regular sequential procedure will
terminate for n < n_0.
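
The truncation rule can be written out in a few lines. This is a minimal sketch, not part of the original text; the function name `truncated_test` and the numerical values in the usage lines are assumptions, and the lines are taken as a_m = h_0 + sm and r_m = h_1 + sm.

```python
def truncated_test(items, h0, h1, s, n0):
    """Sequential plan truncated at n0 observations.

    items : sequence of 0/1 results (1 = defective), at least n0 long
    Returns (number of items inspected, "accept" or "reject").
    """
    d = 0  # cumulative number of defectives
    for m, x in enumerate(items, start=1):
        d += x
        a_m = h0 + s * m  # acceptance number
        r_m = h1 + s * m  # rejection number
        if d <= a_m:
            return m, "accept"
        if d >= r_m:
            return m, "reject"
        if m == n0:
            # No regular decision up to n0: compare d with the midpoint
            # of the acceptance and rejection numbers.
            return m, ("reject" if d >= (a_m + r_m) / 2 else "accept")
    raise ValueError("fewer than n0 observations supplied without a decision")

# Hypothetical plan truncated at n0 = 10.
print(truncated_test([0, 1, 0, 0, 1, 0, 0, 0, 0, 0], -2, 2, 0.1, 10))
print(truncated_test([1, 1, 1, 1], -2, 2, 0.1, 10))
```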



Chapter 6. TESTING THE DIFFERENCE BETWEEN THE
MEANS OF TWO BINOMIAL DISTRIBUTIONS (DOUBLE
DICHOTOMIES)

6.1 Formulation of the Problem

Suppose that we want to compare the effectiveness of two production
processes where the effectiveness of a production process is measured
in terms of the proportion of effective units in the sequence produced.
We shall say that a unit is effective if it has a certain desirable
property, for example, if it withstands a certain strain. Let p_1 be the
proportion of effectives if process 1 is used, and p_2 the proportion of
effectives if process 2 is used. In other words, p_1 is the probability
that a unit produced will be effective if process 1 is used, and p_2 is the
probability that a unit produced will be effective if process 2 is used.
Suppose that the manufacturer does not know the values of p_1 and
p_2, and that process 1 is in operation. If p_1 ≥ p_2, the manufacturer
wants to retain process 1. However, if p_1 < p_2, especially if p_1 is
substantially smaller than p_2, the manufacturer would like to replace
process 1 by process 2. Thus, we are interested in testing the hypothesis
that p_1 ≥ p_2 against the alternative that p_1 < p_2.

A more general formulation of the problem can be stated as follows.
Consider two binomial distributions. Let p_1 be the probability of a
success in a single trial according to the first binomial distribution,
and let p_2 be the probability of a success in a single trial according to
the second binomial distribution. We shall use the symbol 1 for success
and the symbol 0 for failure. Suppose that the probabilities p_1
and p_2 are unknown. We consider the problem of testing the hypothesis
that p_1 ≥ p_2 on the basis of a sample consisting of N_1 observations
from the first binomial distribution and N_2 observations from the
second binomial distribution. Since in many experiments the case
N_1 = N_2 is mainly of interest, and since this case (as we shall see
later) makes an exact and simplified mathematical treatment of the
problem possible, in what follows we shall assume that N_1 = N_2 = N
(say). Thus, on the basis of the outcome of the two series of N
independent trials we have to decide whether the hypothesis p_1 ≥ p_2
should be accepted or rejected.




6.2 The Classical Method 

The classical solution of the problem for large N is given as follows.
Let S_1 be the number of successes in the first set of N trials (drawn
from the first binomial distribution), and let S_2 be the number of
successes in the second set of N trials (drawn from the second binomial
population). Denote (S_1 + S_2)/2N by p and 1 − p by q. Then for
large N the expression

(6:1)    $$\frac{S_2 - S_1}{\sqrt{2Npq}}$$

is normally distributed with zero mean and unit variance if p_1 = p_2.
Suppose that the level of significance we wish to choose is α. Let λ_α
be the value for which the probability that a normal variate with zero
mean and unit variance will exceed λ_α is equal to α. (For example, if
α = .05, λ_α = 1.64.) Thus, if p_1 = p_2, the probability that the
expression (6:1) will exceed λ_α is equal to α. If p_1 > p_2, the probability
that the expression (6:1) will exceed λ_α is less than α. According to
the classical method, the hypothesis that p_1 ≥ p_2 is rejected if the
observed value of (6:1) exceeds λ_α. This method involves an
approximation, since the distribution of (6:1) is not exactly normal (for small
N it is far from normal). For small N an exact method has been proposed
by R. A. Fisher which, however, involves cumbersome calculations.
In Section 6.3 we shall suggest another (non-sequential)
method which is exact and is fairly simple to apply as far as
computations are concerned. The latter method has the further advantage
of being suitable for sequential analysis, to which existing methods
are not readily adaptable.

6.3 An Exact Non-Sequential Method

Let a_1, ..., a_N be the results in the first set of N trials, and b_1, ...,
b_N the results in the second set of N trials. These results are arranged in
the order observed. Consider the sequence of N pairs:

(6:2)    $$(a_1, b_1), \ldots, (a_N, b_N)$$

Let t_1 be the number of pairs (1, 0) and t_2 the number of pairs (0, 1)
in this sequence. We consider only the pairs (0, 1) and (1, 0) and base
the test on them.

Let a be the outcome of an observation from the first population,
and b the outcome of an observation from the second population.
The probability that (a, b) = (1, 0) is equal to p_1(1 − p_2), and the
probability that (a, b) = (0, 1) is equal to (1 − p_1)p_2. Hence, knowing
that (a, b) is equal to one of the pairs (0, 1) and (1, 0), the (conditional)
probability that it is equal to (0, 1) is given by

(6:3)    $$p = \frac{(1 - p_1)p_2}{p_1(1 - p_2) + p_2(1 - p_1)}$$

and the (conditional) probability that it is equal to (1, 0) is given by

(6:4)    $$1 - p = \frac{p_1(1 - p_2)}{p_1(1 - p_2) + (1 - p_1)p_2}$$

Hence, when only the pairs (1, 0) and (0, 1) are considered, the variate
t_2 is distributed like the number of successes in a sequence of t =
t_1 + t_2 independent trials, the probability of a success in a single trial
being equal to p. One can easily verify that p = 1/2 if p_1 = p_2,
p < 1/2 if p_1 > p_2, and p > 1/2 if p_1 < p_2. Thus, the hypothesis to
be tested, i.e., the hypothesis that p_1 ≥ p_2, is equivalent to the
hypothesis that p ≤ 1/2. Thus, we can test the hypothesis that p_1 ≥ p_2 by
testing the hypothesis that p ≤ 1/2 on the basis of the observed value
of t_2. Since the distribution of t_2 is the same as the distribution of the
number of successes in t = t_1 + t_2 independent trials (t is treated as a
constant and the probability of a success in a single trial is equal to
p), the test procedure can be carried out in the usual manner. If we
want a level of significance α, a critical value T is chosen so that for
p = 1/2 the probability that t_2 ≥ T is equal to α. The hypothesis that
p ≤ 1/2 is rejected if and only if the observed t_2 is greater than or equal
to the critical value T. The value of T can be obtained from a table
of the binomial distribution. If t is large, t_2 is nearly normally
distributed, and the critical value T can be obtained from a table of the
normal distribution.
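
The exact procedure lends itself to direct computation. The sketch below is illustrative only: the function names and the sample data are hypothetical, and, because the binomial distribution is discrete, T is taken as the smallest value whose tail probability does not exceed α (an assumption, since a tail probability exactly equal to α is generally not attainable). The classical expression (6:1) is included only for comparison.

```python
from math import comb, sqrt

def count_discordant(a, b):
    """Return (t1, t2): the numbers of pairs (1, 0) and (0, 1)."""
    pairs = list(zip(a, b))
    return pairs.count((1, 0)), pairs.count((0, 1))

def upper_tail(t, k):
    """P(X >= k) for X binomial with t trials and success probability 1/2."""
    return sum(comb(t, j) for j in range(k, t + 1)) / 2 ** t

def critical_value(t, alpha):
    """Smallest T with P(t2 >= T) <= alpha when p = 1/2.

    The value t + 1 means that the hypothesis can never be rejected.
    """
    for T in range(t + 2):
        if upper_tail(t, T) <= alpha:
            return T

def classical_statistic(s1, s2, n):
    """Expression (6:1), shown only for comparison."""
    p = (s1 + s2) / (2 * n)
    return (s2 - s1) / sqrt(2 * n * p * (1 - p))

# Hypothetical results of N = 10 paired trials.
a = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
b = [0, 1, 1, 1, 1, 0, 1, 1, 1, 1]
t1, t2 = count_discordant(a, b)
T = critical_value(t1 + t2, alpha=0.05)
print("t1 =", t1, " t2 =", t2, " T =", T, " reject:", t2 >= T)
```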

This procedure thus provides a simple test of the hypothesis that
p_1 ≥ p_2. The question arises whether the efficiency of this method is
as high as that of the classical method. It would seem that the method
suggested here cannot be a most efficient procedure, since the values
of t_1 and t_2 depend on the order of the elements in the sequences
(a_1, ..., a_N) and (b_1, ..., b_N) and there is no particular reason to
arrange them in the order observed. However, it has been shown ¹
that the loss in efficiency as compared with the classical method is
negligible if the number of trials is large.²

1 See the author's report, Sequential Analysis of Statistical Data: Theory,
submitted to the Applied Mathematics Panel, National Defense Research
Committee, Sept., 1943.

2 The author believes that the loss in efficiency is slight even when N is
small, although no exact investigation of this case has been made.




It should be pointed out that the procedure for testing the hypothesis
that p_1 ≥ p_2 can be used also for testing the hypothesis that
p_1 = p_2 if the alternative hypotheses are restricted to p_2 > p_1.

In addition to simplicity and exactness, the present method seems
superior to the classical one in the following respect. Suppose that
(contrary to the original assumption) the probability of a success varies
from trial to trial. Let p_1^(i) denote the probability of success in the
ith trial of the first set, and let p_2^(i) denote the probability of success
in the ith trial in the second set (i = 1, ..., N). Assume that the
probabilities p_1^(i) and p_2^(i) are entirely unknown and we wish to test
the hypothesis that p_1^(1) − p_2^(1) = ... = p_1^(N) − p_2^(N) = 0. In this
case the classical method is not applicable, but the present method
provides a correct procedure. Such a situation may arise, for instance,
if we want to test the hypothesis that the probability of a success
(hitting the target) is the same for two different guns. In the course
of the experiments the probability of a hit may change because of
external conditions such as wind or disposition of the gunner. However,
these external conditions are likely to affect both guns equally if the
trials are made alternately (or approximately alternately), so that if
the two guns are equally good we have p_1^(i) = p_2^(i) (i = 1, ..., N).

6.4 Sequential Test of the Hypothesis That p_1 ≥ p_2

6.4.1 Risks That We Are Willing to Tolerate of Making Wrong
Decisions

In order to devise a proper sequential test for testing the hypothesis
that p_1 ≥ p_2, we have to state first what risks of making wrong
decisions we are willing to tolerate. The efficiency of production process 1
may be measured by the ratio of effectives to ineffectives produced,
i.e., by k_1 = p_1/(1 − p_1). Production process 1 may be regarded the
more efficient the larger the value of k_1. Similarly, the efficiency of
production process 2 may be measured by k_2 = p_2/(1 − p_2). The
relative superiority of production process 2 over process 1 can then
reasonably be measured by the ratio of k_2 to k_1, i.e., by

(6:5)    $$u = \frac{k_2}{k_1} = \frac{p_2(1 - p_1)}{p_1(1 - p_2)}$$

If u = 1, the two processes are equally good. If u > 1, process 2 is
superior to process 1, and if u < 1, process 1 is superior to process 2.
Thus, the manufacturer will, in general, be able to select two values
of u, u_0 and u_1, say (u_0 < u_1), such that the rejection of process 1 in
favor of process 2 is considered an error of practical importance whenever
the true value of u ≤ u_0 and the maintenance of process 1 is
considered an error of practical importance whenever u ≥ u_1. If u lies
between u_0 and u_1, the manufacturer does not care particularly which
decision is made.

Clearly, we will always have u_0 < u_1. If the transition from
production process 1 to process 2 involves some cost or other
inconveniences, it seems reasonable to put u_0 = 1 (or u_0 may even be slightly
greater than 1). This choice of u_0 really means that we consider the
rejection of process 1 a serious error whenever this process is not
inferior to process 2. On the other hand, if the transition from process 1
to process 2 does not involve any inconveniences, the rejection of
process 1 in favor of 2 cannot be a serious error when the two processes are
equally efficient, i.e., when u = 1. Thus, in such a case it seems
reasonable to choose u_0 somewhat below 1.

After the quantities u_0 and u_1 have been chosen the risks that we
are willing to tolerate may reasonably be expressed in the following
form: The probability of rejecting process 1 should not exceed a
preassigned value α whenever u ≤ u_0, and the probability of maintaining
process 1 should not exceed a preassigned value β whenever u ≥ u_1.
Thus, the risks that we are willing to tolerate are characterized by the
four quantities u_0, u_1, α, and β.


6.4.2 The Sequential Probability Ratio Test Corresponding to the
Quantities u_0, u_1, α, and β

After the four quantities u_0, u_1, α, and β have been chosen, a proper
sequential test can be carried out as follows. The (conditional)
probability that we obtain a pair (0, 1), as given in (6:3), can be expressed
as a function of u. In fact

(6:6)    $$p = \frac{(1 - p_1)p_2}{p_1(1 - p_2) + p_2(1 - p_1)} = \frac{\dfrac{p_2(1 - p_1)}{p_1(1 - p_2)}}{1 + \dfrac{p_2(1 - p_1)}{p_1(1 - p_2)}} = \frac{u}{1 + u}$$

Let H_0 denote the hypothesis that p = u_0/(1 + u_0), and H_1 the
hypothesis that p = u_1/(1 + u_1). A proper sequential test satisfying
our requirements concerning tolerated risks is the sequential
probability ratio test of H_0 against H_1. The acceptance and rejection
numbers for this sequential test can be obtained from (5:11) and (5:12) by
substituting u_0/(1 + u_0) for p_0, u_1/(1 + u_1) for p_1, and t = t_1 + t_2
for m.
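
The identity (6:6) is easy to confirm numerically. The following lines are a small check under assumed values of p_1 and p_2; the function names are hypothetical and serve only this illustration.

```python
from math import isclose

def odds_ratio(p1, p2):
    """The ratio u = k2/k1 of (6:5)."""
    return (p2 * (1 - p1)) / (p1 * (1 - p2))

def conditional_p(p1, p2):
    """Conditional probability (6:3) of a pair (0, 1), given a pair (0, 1) or (1, 0)."""
    return (1 - p1) * p2 / (p1 * (1 - p2) + p2 * (1 - p1))

# Assumed values, chosen only for illustration.
p1, p2 = 0.60, 0.75
u = odds_ratio(p1, p2)
print("u =", u, "  p =", conditional_p(p1, p2), "  u/(1+u) =", u / (1 + u))
print("identity (6:6) holds:", isclose(conditional_p(p1, p2), u / (1 + u)))
```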


Thus, for each value of t the acceptance number is given by

(6:7)    $$a_t = \frac{\log\dfrac{\beta}{1 - \alpha}}{\log u_1 - \log u_0} + t\,\frac{\log\dfrac{1 + u_1}{1 + u_0}}{\log u_1 - \log u_0}$$

and the rejection number is given by

(6:8)    $$r_t = \frac{\log\dfrac{1 - \beta}{\alpha}}{\log u_1 - \log u_0} + t\,\frac{\log\dfrac{1 + u_1}{1 + u_0}}{\log u_1 - \log u_0}$$


These acceptance numbers a_t and rejection numbers r_t (t = 1, 2, ...)
are best tabulated before experimentation starts. The sequential test
is then carried out as follows. The observations are taken in pairs
where each pair consists of an observation from the first process and
an observation from the second process. We continue taking pairs as
long as a_t < t_2 < r_t. The first time that t_2 does not lie between the
acceptance and rejection numbers, experimentation is terminated.
Process 1 is maintained if at this final stage t_2 ≤ a_t, and process 1 is
rejected in favor of 2 if t_2 ≥ r_t.

As an illustration, the following example is given. Let u_0 = 1.3,
u_1 = 3, α = .03, and β = .10. The observed pairs (0, 1) and (1, 0)
in an experiment, and the rejection and acceptance numbers, are given
in Table 6. In this example, the sampling process is terminated at
t = 18 with the retention of process 1.
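
The procedure can be followed step by step in code. The sketch below is an illustration rather than the author's own computation: the function names are hypothetical, natural logarithms are used (the base cancels in the ratios), and the list of pairs is the sequence of observed pairs recorded in Table 6, with u_0 = 1.3, u_1 = 3, α = .03, and β = .10.

```python
from math import log

def sprt_lines(u0, u1, alpha, beta):
    """Intercepts h0, h1 and common slope s of the lines a_t and r_t."""
    denom = log(u1) - log(u0)
    h0 = log(beta / (1 - alpha)) / denom
    h1 = log((1 - beta) / alpha) / denom
    s = log((1 + u1) / (1 + u0)) / denom
    return h0, h1, s

def run_test(pairs, u0, u1, alpha, beta):
    """Return (t at termination, decision) for a sequence of pairs (a, b)."""
    h0, h1, s = sprt_lines(u0, u1, alpha, beta)
    t = t2 = 0
    for a, b in pairs:
        if a == b:
            continue  # pairs (0, 0) and (1, 1) are disregarded
        t += 1
        if (a, b) == (0, 1):
            t2 += 1
        if t2 <= h0 + s * t:   # acceptance number a_t of (6:7)
            return t, "retain process 1"
        if t2 >= h1 + s * t:   # rejection number r_t of (6:8)
            return t, "reject process 1"
    return t, "undecided"

# Pairs (0, 1) and (1, 0) in the order listed in Table 6.
TABLE_6_PAIRS = [(0, 1), (0, 1), (1, 0), (1, 0), (1, 0), (0, 1),
                 (1, 0), (0, 1), (0, 1), (1, 0), (0, 1), (0, 1),
                 (0, 1), (1, 0), (1, 0), (0, 1), (1, 0), (1, 0)]

print(run_test(TABLE_6_PAIRS, u0=1.3, u1=3, alpha=0.03, beta=0.10))
# (18, 'retain process 1')
```

The tabulated acceptance numbers correspond to a_t rounded down and the tabulated rejection numbers to r_t rounded up; since t_2 is an integer, comparing it with the unrounded values gives the same decisions.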

The test procedure can also be carried out graphically as shown in
Fig. 14. The total number t of pairs (0, 1) and (1, 0) is measured along
the horizontal axis. The points (t, a_t) will lie on a straight line L_0,
since a_t is a linear function of t. The points (t, r_t) will lie on a parallel
line L_1. We draw the lines L_0 and L_1 and plot the points (t, t_2) as
experimentation goes on. The first time that the point (t, t_2) is not
within the lines L_0 and L_1 experimentation is terminated. Process 1
is maintained if at the final stage (t, t_2) lies on L_0 or below, and
process 1 is rejected if (t, t_2) lies on L_1 or above.

TABLE 6

Columns: t = number of pairs (0, 1), (1, 0) observed; pair = the pair
(0, 1) or (1, 0) observed; a_t = acceptance number; t_2 = number of
pairs (0, 1) observed; r_t = rejection number.

   t     pair      a_t    t_2    r_t

   1    (0, 1)     ..      1
   2    (0, 1)     ..      2
   3    (1, 0)     ..      2
   4    (1, 0)     ..      2
   5    (1, 0)      0      2
   6    (0, 1)      1      3
   7    (1, 0)      1      3
   8    (0, 1)      2      4
   9    (0, 1)      3      5
  10    (1, 0)      3      5
  11    (0, 1)      4      6
  12    (0, 1)      5      7
  13    (0, 1)      5      8     13
  14    (1, 0)      6      8     14
  15    (1, 0)      7      8     14
  16    (0, 1)      7      9     15
  17    (1, 0)      8      9     16
  18    (1, 0)      9      9     16
  19                9            17
  20               10            18
  21               11            18
  22               11            19
  23               12            20
  24               13            20
  25               13            21
  26               14            22
  27               15            22
  28               15            23
  29               16            24


The intercept of line L_0 is given by

(6:9)    $$h_0 = \frac{\log\dfrac{\beta}{1 - \alpha}}{\log u_1 - \log u_0}$$

and the intercept of L_1 is given by

(6:10)    $$h_1 = \frac{\log\dfrac{1 - \beta}{\alpha}}{\log u_1 - \log u_0}$$

The common slope of the two lines is equal to

(6:11)    $$s = \frac{\log\dfrac{1 + u_1}{1 + u_0}}{\log u_1 - \log u_0}$$

6.4.3 The Operating Characteristic Curve of the Test 

For any value u of the ratio k_2/k_1, we shall denote by L(u) the
probability of maintaining process 1. Clearly, L(u) is a function of u.
This function L(u) is called the operating characteristic function of the
test. It can be obtained from equations (5:19) and (5:20) by
substituting u_0/(1 + u_0) for p_0 and u_1/(1 + u_1) for p_1. These equations
are: ³

(6:12)    $$L(u) = \frac{\left(\dfrac{1 - \beta}{\alpha}\right)^{h} - 1}{\left(\dfrac{1 - \beta}{\alpha}\right)^{h} - \left(\dfrac{\beta}{1 - \alpha}\right)^{h}}$$

and

(6:13)    $$\frac{u}{1 + u} = \frac{1 - \left(\dfrac{1 + u_0}{1 + u_1}\right)^{h}}{\left(\dfrac{u_1(1 + u_0)}{u_0(1 + u_1)}\right)^{h} - \left(\dfrac{1 + u_0}{1 + u_1}\right)^{h}}$$

Equation (6:13) can be written as

(6:14)    $$u = \frac{1 - \left(\dfrac{1 + u_0}{1 + u_1}\right)^{h}}{\left(\dfrac{u_1(1 + u_0)}{u_0(1 + u_1)}\right)^{h} - 1}$$

3 In the formulae given in SRG 255, p. 3.38, the quantities u and L(u) are
expressed in terms of a "dummy" variable x which is functionally related to h.


For any given value h we compute u and L(u) from the equations
(6:12) and (6:14). The point [u, L(u)] obtained in this way will be a
point of the OC curve. By calculating the points [u, L(u)] for a
sufficiently large number of values of h, the OC curve can be drawn.

We shall compute [u, L(u)] for h = −∞, −1, 0, 1, +∞. Since
(1 + u_0)/(1 + u_1) < 1 and u_1(1 + u_0)/[u_0(1 + u_1)] > 1, we obtain
from (6:12) and (6:14)

(6:15)    u = ∞ and L(u) = 0 when h = −∞

(6:16)    u = 0 and L(u) = 1 when h = +∞

Furthermore we obtain

(6:17)    u = u_1 and L(u) = β when h = −1

and

(6:18)    u = u_0 and L(u) = 1 − α when h = +1

For h = 0, the expressions u and L(u) have the form 0/0. The
limiting values of u and L(u) when h → 0 can be obtained by
differentiating numerator and denominator at h = 0. Then we have

(6:19)    $$u = \frac{\log\dfrac{1 + u_1}{1 + u_0}}{\log\dfrac{u_1(1 + u_0)}{u_0(1 + u_1)}} \quad\text{and}\quad L(u) = \frac{\log\dfrac{1 - \beta}{\alpha}}{\log\dfrac{1 - \beta}{\alpha} - \log\dfrac{\beta}{1 - \alpha}}$$

when h = 0.

These five points on the OC curve already determine roughly the
shape of the curve. It can be seen that u is a decreasing function of
h and L(u) is an increasing function of h. Hence L(u) is a decreasing
function of u. As u varies from 0 to u_0, L(u) decreases from 1 to
1 − α. In the interval from u_0 to u_1, L(u) decreases from 1 − α to
β, and as u varies from u_1 to +∞, the OC function L(u) decreases
from β to 0.
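
Points of the OC curve can be generated directly from the parametric form. The following is a minimal sketch, assuming the parametrization by h in (6:12) and (6:14) with the limiting values at h = 0; the function name `oc_point` and the numerical values of u_0, u_1, α, and β (those of the illustration in Section 6.4.2) are used here only as an example.

```python
from math import log

def oc_point(h, u0, u1, alpha, beta):
    """Return the point (u, L(u)) of the OC curve for the parameter h."""
    A = (1 - beta) / alpha
    B = beta / (1 - alpha)
    c = (1 + u0) / (1 + u1)                # less than 1
    r = u1 * (1 + u0) / (u0 * (1 + u1))    # greater than 1
    if h == 0:
        # Both expressions take the form 0/0; use the limiting values.
        return log(1 / c) / log(r), log(A) / (log(A) - log(B))
    L = (A ** h - 1) / (A ** h - B ** h)
    u = (1 - c ** h) / (r ** h - 1)
    return u, L

for h in (-2, -1, 0, 1, 2):
    u, L = oc_point(h, u0=1.3, u1=3, alpha=0.03, beta=0.10)
    print(f"h = {h:+d}   u = {u:.3f}   L(u) = {L:.3f}")
```

Taking more values of h, positive and negative, fills in the curve between and beyond the five points discussed above.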


6.4.4 The Average Amount of Inspection Required by the Test 

For any value u of the ratio k_2/k_1, let E_u(t) denote the expected
value of the total number of pairs (0, 1) and (1, 0) required by the
test. The value of E_u(t) can be obtained from (5:23) by substituting
E_u(t) for E_p(n), L(u) for L(p), u_0/(1 + u_0) for p_0, u_1/(1 + u_1) for p_1,
and u/(1 + u) for p. Thus ⁴

(6:20)    $$E_u(t) = \frac{L(u)\log\dfrac{\beta}{1 - \alpha} + [1 - L(u)]\log\dfrac{1 - \beta}{\alpha}}{\dfrac{u}{1 + u}\log\dfrac{u_1(1 + u_0)}{u_0(1 + u_1)} + \dfrac{1}{1 + u}\log\dfrac{1 + u_0}{1 + u_1}}$$

To compute the expected value of the total number of pairs (including
also the pairs (0, 0) and (1, 1)), we merely have to divide the
right-hand side of equation (6:20) by p_1(1 − p_2) + p_2(1 − p_1).
Since L(0) = 1 and L(∞) = 0, we obtain from (6:20)

(6:21)    $$E_u(t) = \frac{\log\dfrac{\beta}{1 - \alpha}}{\log\dfrac{1 + u_0}{1 + u_1}} \quad\text{when } u = 0$$

and

(6:22)    $$E_u(t) = \frac{\log\dfrac{1 - \beta}{\alpha}}{\log\dfrac{u_1(1 + u_0)}{u_0(1 + u_1)}} \quad\text{when } u = \infty$$

Since L(u_0) = 1 − α and L(u_1) = β, it follows from (6:20) that

(6:23)    $$E_{u_0}(t) = \frac{(1 - \alpha)\log\dfrac{\beta}{1 - \alpha} + \alpha\log\dfrac{1 - \beta}{\alpha}}{\dfrac{u_0}{1 + u_0}\log\dfrac{u_1(1 + u_0)}{u_0(1 + u_1)} + \dfrac{1}{1 + u_0}\log\dfrac{1 + u_0}{1 + u_1}}$$

and

(6:24)    $$E_{u_1}(t) = \frac{\beta\log\dfrac{\beta}{1 - \alpha} + (1 - \beta)\log\dfrac{1 - \beta}{\alpha}}{\dfrac{u_1}{1 + u_1}\log\dfrac{u_1(1 + u_0)}{u_0(1 + u_1)} + \dfrac{1}{1 + u_1}\log\dfrac{1 + u_0}{1 + u_1}}$$

In Section 5.5 we have computed the expected value of n when p
is equal to the slope of the acceptance and rejection lines. This
corresponds to the case when u/(1 + u) = s, i.e., u = s/(1 − s), where the
slope s is given in (6:11). The value of E_u(t) for u = s/(1 − s) can
be obtained from the right-hand member of (5:30), replacing p_1 by
u_1/(1 + u_1) and p_0 by u_0/(1 + u_0). Thus

(6:25)    $$E_{s/(1-s)}(t) = -\frac{\left(\log\dfrac{\beta}{1 - \alpha}\right)\left(\log\dfrac{1 - \beta}{\alpha}\right)}{\log\dfrac{u_1(1 + u_0)}{u_0(1 + u_1)}\,\log\dfrac{1 + u_1}{1 + u_0}}$$

The determination of the five values of E_u(t), as given in (6:21)
through (6:25), may frequently suffice in practice, since these five
points generally give a fairly good idea of the shape of the whole curve.

4 The right-hand member of (6:20) can be expressed as a function of L(u), the
intercepts and the slope of the decision lines. See SRG 255, p. 3.41.
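
The five values are simple to evaluate. The sketch below is a hedged illustration: the function names are hypothetical, the numerical inputs are those of the example in Section 6.4.2, and the values of L(u) at the special points (1, 1 − α, β) are supplied by hand rather than computed from the OC function.

```python
from math import log

def asn_discordant(u, L, u0, u1, alpha, beta):
    """Expected number of pairs (0, 1), (1, 0) from (6:20), given u and L = L(u)."""
    num = L * log(beta / (1 - alpha)) + (1 - L) * log((1 - beta) / alpha)
    den = (u / (1 + u)) * log(u1 * (1 + u0) / (u0 * (1 + u1))) \
        + (1 / (1 + u)) * log((1 + u0) / (1 + u1))
    return num / den

def asn_at_slope(u0, u1, alpha, beta):
    """Formula (6:25): the case u = s/(1 - s), where (6:20) takes the form 0/0."""
    return -(log(beta / (1 - alpha)) * log((1 - beta) / alpha)) / (
        log(u1 * (1 + u0) / (u0 * (1 + u1))) * log((1 + u1) / (1 + u0)))

def asn_all_pairs(expected_discordant, p1, p2):
    """Expected total number of pairs, including the pairs (0, 0) and (1, 1)."""
    return expected_discordant / (p1 * (1 - p2) + p2 * (1 - p1))

u0, u1, alpha, beta = 1.3, 3, 0.03, 0.10
e_zero = asn_discordant(0, 1, u0, u1, alpha, beta)          # (6:21)
e_u0 = asn_discordant(u0, 1 - alpha, u0, u1, alpha, beta)   # (6:23)
e_u1 = asn_discordant(u1, beta, u0, u1, alpha, beta)        # (6:24)
e_slope = asn_at_slope(u0, u1, alpha, beta)                 # (6:25)
print(round(e_zero, 1), round(e_u0, 1), round(e_slope, 1), round(e_u1, 1))
```

The value for u = ∞ in (6:22) is the same expression as `asn_discordant` with L = 0 in the limit of large u, so it is most simply computed from (6:22) itself.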

6.4.5 Observations Taken in Groups

In applications it may happen that, at each stage in the sequential
process, instead of drawing a single observation we draw a group of
v observations from each of the binomial distributions. Hence, instead
of a single pair, we have two groups of v observations. The effect of
grouping on the OC and ASN curves has been discussed in Section 5.6
and the results obtained there can be applied to the case under
consideration here. If the order of observations in each group of v is
recorded, we can establish the number of pairs (0, 1) and the number of
pairs (1, 0) for each pair of groups of v observations. In such a case
the test can be carried out as described in Section 6.4.2, since after
each pair of groups of v observations we can compute t and t_2.
However, if the order of observations in such groups is not recorded, the
difficulty arises that we are not able to determine the values of t and
t_2 needed for the test procedure.

It has been shown ⁵ that in such a case we may replace t and t_2 by
certain estimates of t and t_2 without affecting seriously the probability
of making an incorrect decision. The estimates of t_1 and t_2 (and thereby
also an estimate of t = t_1 + t_2) are obtained as follows. Let v_1 be the
number of successes in the group of v observations drawn from the first
binomial distribution, and let v_2 be the number of successes in the
group of v observations drawn from the second binomial distribution.
Then for this pair of groups of v observations we estimate the number
of pairs (1, 0) to be v_1 − (v_1 v_2/v) and the number of pairs (0, 1) to be
v_2 − (v_1 v_2/v). Thus, an estimate of t_1 is obtained by summing
v_1 − (v_1 v_2/v) over all pairs of groups observed, and that of t_2 is
obtained by summing v_2 − (v_1 v_2/v) over all pairs of groups observed.

For the effect of grouping on the OC and ASN curves, the results
of Section 5.6 can be applied, since the test procedure discussed here
reduces to that considered in Section 5.6 when p = u/(1 + u),
m = t_1 + t_2 = t, and d_m = t_2.

5 See the author's report, Sequential Analysis of Statistical Data: Theory,
submitted to the Applied Mathematics Panel, National Defense Research
Committee, Sept., 1943.
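
The estimates of t_1 and t_2 described above, for groups in which the order of observations is not recorded, amount to a short summation. The following sketch is illustrative; the function name and the group counts are hypothetical.

```python
def estimate_discordant(groups, v):
    """Estimate (t1, t2) from success counts when the order is not recorded.

    groups : list of (v1, v2), the numbers of successes in each pair of
             groups of v observations from the first and second process
    """
    t1 = sum(v1 - v1 * v2 / v for v1, v2 in groups)  # estimated pairs (1, 0)
    t2 = sum(v2 - v1 * v2 / v for v1, v2 in groups)  # estimated pairs (0, 1)
    return t1, t2

# Two hypothetical pairs of groups, each group containing v = 10 observations.
t1_hat, t2_hat = estimate_discordant([(4, 6), (5, 5)], v=10)
print("estimated t1:", t1_hat, " estimated t2:", t2_hat,
      " estimated t:", t1_hat + t2_hat)
```

The estimates are in general not integers; they take the place of t and t_2 when comparing with the acceptance and rejection numbers.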



Chapter 7. TESTING THAT THE MEAN OF A NORMAL
DISTRIBUTION WITH KNOWN STANDARD DEVIATION FALLS
SHORT OF A GIVEN VALUE


7.1 Formulation of the Problem 

Let x be a random variable which is normally distributed with
unknown mean θ and known standard deviation σ. In this section we
shall deal with the problem of testing the hypothesis that θ is less than
or equal to some specified value θ'.

Such a problem arises frequently, for example, in quality control and
acceptance inspection. Suppose that a lot consisting of a large number
of units of a manufactured product is submitted for acceptance
inspection. The number of units in the lot is assumed to be sufficiently large
so that the lot may be treated as containing infinitely many units.
Suppose that the result of an observation is a measurement x of some
quality characteristic of the unit, such as the weight, or hardness, or
tensile strength. The value of x will, in general, vary from unit to
unit. It is assumed that x is normally distributed with a known
standard deviation σ but unknown mean θ. Suppose, furthermore, that the
product is considered the more desirable the smaller the value of θ.
Then it will, in general, be possible to designate a particular value θ'
such that we prefer to accept the lot if θ < θ' and we prefer to reject
the lot if θ > θ'. Thus, in such a situation, we are interested in
devising a sampling plan to test the hypothesis that θ < θ'.

Since quality control and acceptance inspection is an important field 
of application for such test procedures, we shall continue the discus- 
sion using the terminology of acceptance inspection. This, of course, 
should not be interpreted as a restriction on the general validity and 
applicability of the test procedure. 


7.2 Tolerated Risks of Making Wrong Decisions

If θ = θ', we are indifferent whether the lot is accepted or rejected.
The preference for acceptance increases with decreasing value of θ in
the domain θ < θ', and the preference for rejection increases with
increasing value of θ in the domain θ > θ'. Thus, it will be possible in
general, to find two values θ_0 and θ_1 (θ_0 < θ' and θ_1 > θ') such that
rejection of the lot is considered an error of practical consequence if
θ ≤ θ_0 and acceptance of the lot is considered an error of practical
consequence if θ ≥ θ_1; for values θ between θ_0 and θ_1 we do not care
particularly which decision is taken. Using the terminology introduced
in Section 2.3.1, we may say that the zone of preference for acceptance
consists of all values θ for which θ ≤ θ_0, the zone of preference for
rejection is the set of all values θ for which θ ≥ θ_1, and the zone of
indifference consists of all values θ between θ_0 and θ_1.

After the two values θ_0 and θ_1 have been chosen the risks that we
are willing to tolerate may reasonably be expressed as follows.¹ The
probability of rejecting the lot should not exceed a small preassigned
value α whenever θ ≤ θ_0, and the probability of accepting the lot
should not exceed a small preassigned value β whenever θ ≥ θ_1. Thus,
the risks that we are willing to tolerate are characterized by the four
numbers θ_0, θ_1, α, and β.

1 See, for instance, Section 2.3.2.

7.3 The Sequential Probability Ratio Test Corresponding to the
Quantities θ_0, θ_1, α, and β

The requirements regarding the tolerated risks are satisfied by the
sequential probability ratio test of strength (α, β) for testing the
hypothesis that θ = θ_0 against the alternative that θ = θ_1. This
sequential test is given as follows. Let x_1, x_2, ..., etc., be the successive
observations on x. The probability density of the sample (x_1, ..., x_m)
is given by

(7:1)    $$p_{0m} = \frac{1}{(2\pi)^{m/2}\sigma^{m}}\, e^{-\frac{1}{2\sigma^2}\sum_{\alpha=1}^{m}(x_\alpha - \theta_0)^2}$$

if θ = θ_0, and by

(7:2)    $$p_{1m} = \frac{1}{(2\pi)^{m/2}\sigma^{m}}\, e^{-\frac{1}{2\sigma^2}\sum_{\alpha=1}^{m}(x_\alpha - \theta_1)^2}$$

if θ = θ_1. The probability ratio p_{1m}/p_{0m} is computed at each stage
of the inspection. Additional observations are taken as long as

(7:3)    $$B < \frac{p_{1m}}{p_{0m}} = \frac{e^{-\frac{1}{2\sigma^2}\sum_{\alpha=1}^{m}(x_\alpha - \theta_1)^2}}{e^{-\frac{1}{2\sigma^2}\sum_{\alpha=1}^{m}(x_\alpha - \theta_0)^2}} < A$$

Inspection is terminated with the acceptance of the lot if

(7:4)    $$\frac{e^{-\frac{1}{2\sigma^2}\sum_{\alpha=1}^{m}(x_\alpha - \theta_1)^2}}{e^{-\frac{1}{2\sigma^2}\sum_{\alpha=1}^{m}(x_\alpha - \theta_0)^2}} \le B$$

Inspection is terminated with the rejection of the lot if

(7:5)    $$\frac{e^{-\frac{1}{2\sigma^2}\sum_{\alpha=1}^{m}(x_\alpha - \theta_1)^2}}{e^{-\frac{1}{2\sigma^2}\sum_{\alpha=1}^{m}(x_\alpha - \theta_0)^2}} \ge A$$

According to Section 3.3 approximate values of A and B are given
by (1 − β)/α and β/(1 − α), respectively.

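By taking logarithms and simplifying, the inequalities (7:3), (7:4),
and (7:5) can be written as

(7:6)    $$\log\frac{\beta}{1 - \alpha} < \frac{\theta_1 - \theta_0}{\sigma^2}\sum_{\alpha=1}^{m} x_\alpha + \frac{m}{2\sigma^2}(\theta_0^2 - \theta_1^2) < \log\frac{1 - \beta}{\alpha}$$

(7:7)    $$\frac{\theta_1 - \theta_0}{\sigma^2}\sum_{\alpha=1}^{m} x_\alpha + \frac{m}{2\sigma^2}(\theta_0^2 - \theta_1^2) \le \log\frac{\beta}{1 - \alpha}$$

and

(7:8)    $$\frac{\theta_1 - \theta_0}{\sigma^2}\sum_{\alpha=1}^{m} x_\alpha + \frac{m}{2\sigma^2}(\theta_0^2 - \theta_1^2) \ge \log\frac{1 - \beta}{\alpha}$$

respectively.

Further simplification in carrying out the test procedure can be
achieved by adding −(m/2σ²)(θ_0² − θ_1²) to both sides of the
inequalities (7:6), (7:7), and (7:8) and then dividing these inequalities by
(θ_1 − θ_0)/σ². These operations transform the inequalities (7:6), (7:7),
and (7:8) into

(7:9)    $$\frac{\sigma^2}{\theta_1 - \theta_0}\log\frac{\beta}{1 - \alpha} + m\,\frac{\theta_0 + \theta_1}{2} < \sum_{\alpha=1}^{m} x_\alpha < \frac{\sigma^2}{\theta_1 - \theta_0}\log\frac{1 - \beta}{\alpha} + m\,\frac{\theta_0 + \theta_1}{2}$$

(7:10)    $$\sum_{\alpha=1}^{m} x_\alpha \le \frac{\sigma^2}{\theta_1 - \theta_0}\log\frac{\beta}{1 - \alpha} + m\,\frac{\theta_0 + \theta_1}{2}$$

and

(7:11)    $$\sum_{\alpha=1}^{m} x_\alpha \ge \frac{\sigma^2}{\theta_1 - \theta_0}\log\frac{1 - \beta}{\alpha} + m\,\frac{\theta_0 + \theta_1}{2}$$

respectively.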


By using the inequalities (7:9), (7:10), and (7:11) the inspection plan may be carried out as follows. For each $m$ compute the acceptance number

(7:12)  $a_m = \dfrac{\sigma^{2}}{\theta_1-\theta_0}\log\dfrac{\beta}{1-\alpha} + m\,\dfrac{\theta_0+\theta_1}{2}$

and the rejection number

(7:13)  $r_m = \dfrac{\sigma^{2}}{\theta_1-\theta_0}\log\dfrac{1-\beta}{\alpha} + m\,\dfrac{\theta_0+\theta_1}{2}$

These acceptance and rejection numbers are best computed before inspection starts. Inspection is continued as long as $a_m < \sum x_\alpha < r_m$. At the first time when $\sum x_\alpha$ does not lie between $a_m$ and $r_m$ inspection is terminated. The lot is accepted if $\sum x_\alpha \le a_m$, and the lot is rejected if $\sum x_\alpha \ge r_m$.

As an illustration, consider the following example. Let $\theta_0 = 135$, $\theta_1 = 150$, $\alpha = .01$, and $\beta = .03$. Furthermore, let $\sigma = 25$. The observations and the acceptance and rejection numbers are tabulated in Table 7, which shows that the sampling inspection is terminated at $m = 20$ with the acceptance of the lot.
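The example can be retraced with a short script. This is only a sketch: the function names are illustrative and not from the text, the observations are those listed in Table 7, and the limits are left unrounded, whereas the table appears to round acceptance numbers down and rejection numbers up.

```python
import math

def mean_test_limits(theta0, theta1, sigma, alpha, beta, m):
    """Acceptance and rejection numbers (7:12) and (7:13) for sample size m."""
    scale = sigma ** 2 / (theta1 - theta0)
    slope = (theta0 + theta1) / 2.0
    a_m = scale * math.log(beta / (1.0 - alpha)) + m * slope
    r_m = scale * math.log((1.0 - beta) / alpha) + m * slope
    return a_m, r_m

def run_mean_test(observations, theta0, theta1, sigma, alpha, beta):
    """Return (m, decision) at the first m where the cumulated sum leaves (a_m, r_m)."""
    total = 0.0
    for m, x in enumerate(observations, start=1):
        total += x
        a_m, r_m = mean_test_limits(theta0, theta1, sigma, alpha, beta, m)
        if total <= a_m:
            return m, "accept"
        if total >= r_m:
            return m, "reject"
    return len(observations), "continue"

# Observed values x from Table 7 (m = 1, ..., 20).
TABLE7_OBSERVATIONS = [151, 144, 121, 137, 138, 136, 155, 160, 144, 145,
                       130, 120, 104, 140, 125, 106, 145, 123, 138, 108]

result = run_mean_test(TABLE7_OBSERVATIONS, 135.0, 150.0, 25.0, 0.01, 0.03)
print(result)
```

With these inputs the cumulated sum first reaches the acceptance number at the twentieth observation, in agreement with the table.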

                                 TABLE 7

  Number of       Acceptance    Observed    Cumulated sum of    Rejection
  observations    number        value       observed values     number
  m               a_m           x           Σx                  r_m

   1               ...          151           151                 334
   2               139          144           295                 476
   3               281          121           416                 619
   4               424          137           553                 761
   5               566          138           691                 904
   6               709          136           827                1046
   7               851          155           982                1189
   8               994          160          1142                1331
   9              1136          144          1286                1474
  10              1279          145          1431                1616
  11              1421          130          1561                1759
  12              1564          120          1681                1901
  13              1706          104          1785                2044
  14              1849          140          1925                2186
  15              1991          125          2050                2329
  16              2134          106          2156                2471
  17              2276          145          2301                2614
  18              2419          123          2424                2756
  19              2561          138          2562                2899
  20              2704          108          2670                3041
  21              2846          ...           ...                3184
  22              2989          ...           ...                3326
  23              3131          ...           ...                3469
  24              3274          ...           ...                3611
  25              3416          ...           ...                3754

The test procedure can also be carried out graphically as shown in Fig. 15. The number $m$ of observations is measured along the horizontal axis. The points $(m, a_m)$ will lie on a straight line $L_0$ and the points $(m, r_m)$ will lie on a parallel line $L_1$. We draw the parallel lines $L_0$ and $L_1$ before inspection starts. The points $(m, \sum_{\alpha=1}^{m} x_\alpha)$ are plotted as inspection goes on. Inspection is continued as long as the plotted points $(m, \sum x_\alpha)$ lie between the lines $L_0$ and $L_1$. Inspection is terminated at the first time when the point $(m, \sum x_\alpha)$ does not lie between $L_0$ and $L_1$. If it lies on $L_0$ or below the lot is accepted, and if it lies on $L_1$ or above the lot is rejected.

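The common slope of the lines $L_0$ and $L_1$ is given by

(7:14)  $s = \dfrac{\theta_0+\theta_1}{2}$

The intercept of $L_0$ is equal to

(7:15)  $h_0 = \dfrac{\sigma^{2}}{\theta_1-\theta_0}\log\dfrac{\beta}{1-\alpha}$

and the intercept of $L_1$ is given by

(7:16)  $h_1 = \dfrac{\sigma^{2}}{\theta_1-\theta_0}\log\dfrac{1-\beta}{\alpha}$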





7.4 The Operating Characteristic (OC) Curve of the Test 

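Let $L(\theta)$ denote the probability that the sequential test will lead to the acceptance of the lot when $\theta$ is the true mean value. The function $L(\theta)$ is called the operating characteristic function of the test. Approximate formulas for the OC function are derived in Section 3.4 and the general results are applied to testing the mean of a normal population. [See equation (3:48).] It is shown there that

(7:17)  $L(\theta) = \dfrac{\left(\dfrac{1-\beta}{\alpha}\right)^{h} - 1}{\left(\dfrac{1-\beta}{\alpha}\right)^{h} - \left(\dfrac{\beta}{1-\alpha}\right)^{h}}$

where

(7:18)  $h = \dfrac{\theta_1+\theta_0-2\theta}{\theta_1-\theta_0}$

It can be seen from (7:17) and (7:18) that $L(\theta)$ is an increasing function of $h$ and $h$ is a decreasing function of $\theta$. Hence $L(\theta)$ is a decreasing function of $\theta$.

For $\theta = -\infty$, $\theta_0$, $(\theta_0+\theta_1)/2$, $\theta_1$, $+\infty$ the values of $L(\theta)$ obtained from (7:17) are given as follows.²

(7:19)  $L(-\infty) = 1$;  $L(\theta_0) = 1-\alpha$;  $L\!\left(\dfrac{\theta_0+\theta_1}{2}\right) = \dfrac{\log\dfrac{1-\beta}{\alpha}}{\log\dfrac{1-\beta}{\alpha} - \log\dfrac{\beta}{1-\alpha}}$;  $L(\theta_1) = \beta$;  $L(\infty) = 0$

The computation of these five points of the OC curve will suffice in many instances.

² For $\theta = (\theta_0+\theta_1)/2$, $h = 0$ and the limiting value of the right-hand member of (7:17) as $h \to 0$ is equal to $\log\dfrac{1-\beta}{\alpha} \Big/ \left(\log\dfrac{1-\beta}{\alpha} - \log\dfrac{\beta}{1-\alpha}\right)$.

It may be of interest to express $L(\theta)$ in terms of the intercepts $h_0$ and $h_1$ and the common slope $s$ of the lines $L_0$ and $L_1$.³ From (7:17) and (7:18) it follows that

(7:20)  $L(\theta) = \dfrac{e^{\frac{\theta_1+\theta_0-2\theta}{\theta_1-\theta_0}\log\frac{1-\beta}{\alpha}} - 1}{e^{\frac{\theta_1+\theta_0-2\theta}{\theta_1-\theta_0}\log\frac{1-\beta}{\alpha}} - e^{\frac{\theta_1+\theta_0-2\theta}{\theta_1-\theta_0}\log\frac{\beta}{1-\alpha}}}$

Since $h_0 = \dfrac{\sigma^{2}}{\theta_1-\theta_0}\log\dfrac{\beta}{1-\alpha}$, $h_1 = \dfrac{\sigma^{2}}{\theta_1-\theta_0}\log\dfrac{1-\beta}{\alpha}$ and $s = \dfrac{\theta_0+\theta_1}{2}$, we obtain from (7:20)

(7:21)  $L(\theta) = \dfrac{e^{\frac{2}{\sigma^{2}}(s-\theta)h_1} - 1}{e^{\frac{2}{\sigma^{2}}(s-\theta)h_1} - e^{\frac{2}{\sigma^{2}}(s-\theta)h_0}}$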
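As a rough numerical check of (7:17) to (7:19), the sketch below evaluates the OC function at three of the five points for the example values of Section 7.3. The function name is illustrative only, and for very large |h| the exponentials would overflow, which this sketch does not guard against.

```python
import math

def oc_mean_test(theta, theta0, theta1, alpha, beta):
    """Approximate OC value L(theta) from (7:17) and (7:18)."""
    log_a = math.log((1.0 - beta) / alpha)   # log of (1 - beta)/alpha
    log_b = math.log(beta / (1.0 - alpha))   # log of beta/(1 - alpha)
    h = (theta1 + theta0 - 2.0 * theta) / (theta1 - theta0)
    if h == 0.0:
        # Limiting value at theta = (theta0 + theta1)/2.
        return log_a / (log_a - log_b)
    a_h = math.exp(h * log_a)
    b_h = math.exp(h * log_b)
    return (a_h - 1.0) / (a_h - b_h)

# theta0 = 135, theta1 = 150, alpha = .01, beta = .03 as in the example.
oc_points = {theta: oc_mean_test(theta, 135.0, 150.0, 0.01, 0.03)
             for theta in (135.0, 142.5, 150.0)}
print(oc_points)
```

The values at $\theta_0$ and $\theta_1$ reproduce $1-\alpha$ and $\beta$, and the value at the midpoint is the limiting value given in (7:19).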


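7.5 The Average Amount of Inspection Required by the Test

In Section 3.5 the following approximation formula is derived for the expected value $E_\theta(n)$ of the number $n$ of observations required by the sampling plan.

(7:22)  $E_\theta(n) = \dfrac{L(\theta)\log\dfrac{\beta}{1-\alpha} + [1-L(\theta)]\log\dfrac{1-\beta}{\alpha}}{E_\theta(z)}$

where

(7:23)  $z = \log\dfrac{f(x,\theta_1)}{f(x,\theta_0)} = \log\dfrac{e^{-\frac{1}{2\sigma^{2}}(x-\theta_1)^{2}}}{e^{-\frac{1}{2\sigma^{2}}(x-\theta_0)^{2}}} = \dfrac{1}{2\sigma^{2}}\left[2(\theta_1-\theta_0)x + \theta_0^{2} - \theta_1^{2}\right]$

and $E_\theta(z)$ denotes the expected value of $z$ when $\theta$ is the true mean of $x$. The value of $E_\theta(z)$ is given in Section 3.5, equation (3:60).

(7:24)  $E_\theta(z) = \dfrac{1}{2\sigma^{2}}\left[2(\theta_1-\theta_0)\theta + \theta_0^{2} - \theta_1^{2}\right]$

Hence

(7:25)  $E_\theta(n) = 2\sigma^{2}\,\dfrac{L(\theta)\log\dfrac{\beta}{1-\alpha} + [1-L(\theta)]\log\dfrac{1-\beta}{\alpha}}{\theta_0^{2} - \theta_1^{2} + 2(\theta_1-\theta_0)\theta} = \dfrac{h_1 + L(\theta)(h_0 - h_1)}{\theta - s}$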


³ See also SRG 255, p. 4.19.




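where $h_0$ and $h_1$ are the intercepts and $s$ is the common slope of the lines $L_0$ and $L_1$.

For $\theta = s$, the right-hand member of (7:25) takes the form 0/0. It is shown in the Appendix, equation (A:99), that the limiting value is given by

(7:26)  $E_s(n) = \dfrac{-\log\dfrac{\beta}{1-\alpha}\,\log\dfrac{1-\beta}{\alpha}}{E_s(z^{2})}$

Since $E_s(z) = 0$, $E_s(z^{2})$ is equal to the variance of $z$. From (7:23) it follows that the variance of $z$ is equal to $(\theta_1-\theta_0)^{2}/\sigma^{2}$. Hence

(7:27)  $E_s(n) = -\dfrac{\sigma^{2}}{(\theta_1-\theta_0)^{2}}\log\dfrac{\beta}{1-\alpha}\,\log\dfrac{1-\beta}{\alpha} = -\dfrac{h_0 h_1}{\sigma^{2}}$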
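Formulas (7:25) and (7:27) can be evaluated together as in the sketch below, which uses the slope and intercept form and falls back to the limiting value at $\theta = s$. The function name is illustrative only, and the numerical inputs are those of the example in Section 7.3.

```python
import math

def asn_mean_test(theta, theta0, theta1, sigma, alpha, beta):
    """Approximate expected number of observations, (7:25) and (7:27)."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    scale = sigma ** 2 / (theta1 - theta0)
    h0, h1 = scale * log_b, scale * log_a      # intercepts (7:15), (7:16)
    s = (theta0 + theta1) / 2.0                # common slope (7:14)
    if math.isclose(theta, s):
        return -h0 * h1 / sigma ** 2           # limiting value (7:27)
    h = (theta1 + theta0 - 2.0 * theta) / (theta1 - theta0)
    a_h, b_h = math.exp(h * log_a), math.exp(h * log_b)
    oc = (a_h - 1.0) / (a_h - b_h)             # L(theta) from (7:17)
    return (h1 + oc * (h0 - h1)) / (theta - s)

asn_at_theta0 = asn_mean_test(135.0, 135.0, 150.0, 25.0, 0.01, 0.03)
asn_at_slope = asn_mean_test(142.5, 135.0, 150.0, 25.0, 0.01, 0.03)
asn_at_theta1 = asn_mean_test(150.0, 135.0, 150.0, 25.0, 0.01, 0.03)
print(asn_at_theta0, asn_at_slope, asn_at_theta1)
```

For these inputs the expected amount of inspection is largest near $\theta = s$, midway between $\theta_0$ and $\theta_1$.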





Chapter 8. TESTING THAT THE STANDARD DEVIATION OF
A NORMAL DISTRIBUTION DOES NOT EXCEED A GIVEN VALUE

8.1 Formulation of the Problem 

Let $x$ be a normally distributed variate. In this section we shall deal with the problem of testing the hypothesis that the standard deviation $\sigma$ of $x$ does not exceed a given value $\sigma'$. There are two cases to be considered: the mean of $x$ is known or unknown. First we shall treat the case when the mean of $x$ is known. If the mean of $x$ is unknown, only a slight modification of the test procedure will be necessary, as will be seen later.

This problem, like the one treated in Section 7, arises frequently in quality control and acceptance inspection. Suppose that $x$ is some measurable quality characteristic of a manufactured product and that $x$ is normally distributed in the population of units produced. Suppose, furthermore, that the quality of the product is considered the better the smaller the standard deviation $\sigma$. Thus, there will be, in general, a value $\sigma'$ such that the product is considered substandard if $\sigma > \sigma'$ and the product is considered satisfactory (meets specification) if $\sigma \le \sigma'$. Since $\sigma$ is unknown, the problem is to devise a sampling plan for testing the hypothesis that the product is satisfactory, i.e., that $\sigma \le \sigma'$.

8.2 Tolerated Risks for Making a Wrong Decision 

If the quality of the product is exactly on the margin, i.e., if $\sigma = \sigma'$, it will make no difference whether the product is classified as satisfactory or as substandard. However, if $\sigma$ is considerably smaller than $\sigma'$, the classification of the product as substandard will usually be regarded as an error of practical importance. Similarly, if $\sigma$ is much larger than $\sigma'$, the classification of the product as satisfactory will be a serious error. Thus, it will be possible to specify two values $\sigma_0$ and $\sigma_1$ ($\sigma_0 < \sigma'$ and $\sigma_1 > \sigma'$) such that the classification of the product as substandard is considered an error of practical importance whenever $\sigma \le \sigma_0$, and the classification of the product as satisfactory is regarded as an error of practical consequence whenever $\sigma \ge \sigma_1$; for values $\sigma$ between $\sigma_0$ and $\sigma_1$ we do not care particularly which action is taken.

In accordance with the considerations in Section 2.3.2, the risks that we are willing to tolerate may reasonably be stated as follows: The probability of classifying the product as substandard should not exceed a small preassigned value $\alpha$ whenever $\sigma \le \sigma_0$, and the probability of classifying the product as satisfactory should not exceed a preassigned value $\beta$ whenever $\sigma \ge \sigma_1$.


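8.3 The Sequential Probability Ratio Test Corresponding to the Quantities $\sigma_0$, $\sigma_1$, $\alpha$, and $\beta$

A sampling plan satisfying the requirements regarding the tolerated risks is given by the sequential probability ratio test of strength $(\alpha, \beta)$ for testing the hypothesis that $\sigma = \sigma_0$ against the alternative that $\sigma = \sigma_1$.

Let $x_1, x_2, \ldots$, etc., denote the successive observations on $x$. The probability density of the sample $(x_1, \ldots, x_m)$ is given by

(8:1)  $p_m = \dfrac{1}{(2\pi)^{m/2}\sigma^{m}}\, e^{-\frac{1}{2\sigma^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2}}$

where the value of the mean $\theta$ is assumed to be known. Let $p_{im}$ denote the expression we obtain if $\sigma$ is replaced by $\sigma_i$ ($i = 0, 1$) in the right-hand member of (8:1). The sequential probability ratio test is given as follows. The probability ratio $p_{1m}/p_{0m}$ is computed at each stage of the experiment. Additional observations are taken as long as¹

(8:2)  $\dfrac{\beta}{1-\alpha} < \dfrac{p_{1m}}{p_{0m}} = \dfrac{\dfrac{1}{\sigma_1^{m}}\, e^{-\frac{1}{2\sigma_1^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2}}}{\dfrac{1}{\sigma_0^{m}}\, e^{-\frac{1}{2\sigma_0^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2}}} < \dfrac{1-\beta}{\alpha}$

The product is classified as satisfactory if

(8:3)  $\dfrac{\dfrac{1}{\sigma_1^{m}}\, e^{-\frac{1}{2\sigma_1^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2}}}{\dfrac{1}{\sigma_0^{m}}\, e^{-\frac{1}{2\sigma_0^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2}}} \le \dfrac{\beta}{1-\alpha}$

The product is classified as substandard if

(8:4)  $\dfrac{\dfrac{1}{\sigma_1^{m}}\, e^{-\frac{1}{2\sigma_1^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2}}}{\dfrac{1}{\sigma_0^{m}}\, e^{-\frac{1}{2\sigma_0^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2}}} \ge \dfrac{1-\beta}{\alpha}$

¹ There is a slight approximation involved in the formulas given below, since the constants $A$ and $B$ are put equal to $(1-\beta)/\alpha$ and $\beta/(1-\alpha)$ respectively. In this connection see Section 3.3.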

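Taking logarithms, dividing by $(1/2\sigma_0^{2}) - (1/2\sigma_1^{2})$ and simplifying, the inequalities (8:2), (8:3), and (8:4) will become

(8:5)  $\dfrac{2\log\dfrac{\beta}{1-\alpha} + m\log\dfrac{\sigma_1^{2}}{\sigma_0^{2}}}{\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}} < \sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2} < \dfrac{2\log\dfrac{1-\beta}{\alpha} + m\log\dfrac{\sigma_1^{2}}{\sigma_0^{2}}}{\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}}$

(8:6)  $\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2} \le \dfrac{2\log\dfrac{\beta}{1-\alpha} + m\log\dfrac{\sigma_1^{2}}{\sigma_0^{2}}}{\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}}$

and

(8:7)  $\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2} \ge \dfrac{2\log\dfrac{1-\beta}{\alpha} + m\log\dfrac{\sigma_1^{2}}{\sigma_0^{2}}}{\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}}$

respectively.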

On the basis of the inequalities (8:5), (8:6), and (8:7), the test procedure can be carried out as follows: For each integral value $m$ compute the acceptance number

(8:8)  $a_m = \dfrac{2\log\dfrac{\beta}{1-\alpha}}{\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}} + m\,\dfrac{\log\dfrac{\sigma_1^{2}}{\sigma_0^{2}}}{\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}}$

and the rejection number

(8:9)  $r_m = \dfrac{2\log\dfrac{1-\beta}{\alpha}}{\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}} + m\,\dfrac{\log\dfrac{\sigma_1^{2}}{\sigma_0^{2}}}{\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}}$

These acceptance and rejection numbers do not depend on the outcome of the observations and, therefore, they can be computed before inspection starts. Inspection is continued as long as $a_m < \sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2} < r_m$. The first time that $\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2}$ does not lie between $a_m$ and $r_m$, inspection is terminated. If at the final stage $\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2} \le a_m$ the product is declared satisfactory, and if $\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2} \ge r_m$ the product is declared substandard.
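A minimal sketch of this procedure for a known mean is given below. The function names and the numerical inputs ($\sigma_0 = 1$, $\sigma_1 = 2$, $\alpha = \beta = .05$, and the artificial observation sequences) are assumptions chosen for illustration and do not come from the text.

```python
import math

def sd_test_lines(sigma0, sigma1, alpha, beta):
    """Slope and intercepts of the acceptance and rejection lines, from (8:8) and (8:9)."""
    d = 1.0 / sigma0 ** 2 - 1.0 / sigma1 ** 2
    s = math.log(sigma1 ** 2 / sigma0 ** 2) / d
    h0 = 2.0 * math.log(beta / (1.0 - alpha)) / d
    h1 = 2.0 * math.log((1.0 - beta) / alpha) / d
    return s, h0, h1

def run_sd_test(observations, theta, sigma0, sigma1, alpha, beta):
    """Sequential test on the cumulated sum of squared deviations from the known mean theta."""
    s, h0, h1 = sd_test_lines(sigma0, sigma1, alpha, beta)
    q = 0.0
    for m, x in enumerate(observations, start=1):
        q += (x - theta) ** 2
        if q <= h0 + m * s:        # a_m
            return m, "satisfactory"
        if q >= h1 + m * s:        # r_m
            return m, "substandard"
    return len(observations), "continue"

# Hypothetical inputs: no scatter at all, then a constant deviation of 3.
tight = run_sd_test([0.0] * 10, 0.0, 1.0, 2.0, 0.05, 0.05)
loose = run_sd_test([3.0] * 10, 0.0, 1.0, 2.0, 0.05, 0.05)
print(tight, loose)
```

Because $a_m$ is negative for small $m$ while the sum of squares is never negative, acceptance cannot occur until $m$ is large enough for $a_m$ to become non-negative.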


A graphical presentation of the test procedure is shown in Fig. 16. The number $m$ of observations is measured along the horizontal axis. Since both $a_m$ and $r_m$ are linear functions of $m$, the points $(m, a_m)$ will lie on a straight line $L_0$ and the points $(m, r_m)$ will lie on a straight line $L_1$. These two lines are parallel and the common slope is given by

(8:10)  $s = \dfrac{\log\dfrac{\sigma_1^{2}}{\sigma_0^{2}}}{\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}}$

The intercept of $L_0$ is equal to

(8:11)  $h_0 = \dfrac{2\log\dfrac{\beta}{1-\alpha}}{\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}}$

and the intercept of $L_1$ is given by

(8:12)  $h_1 = \dfrac{2\log\dfrac{1-\beta}{\alpha}}{\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}}$

The lines $L_0$ and $L_1$ can be drawn before inspection starts. As inspection goes on the points $[m, \sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2}]$ are plotted. The first time that the point $[m, \sum(x_\alpha-\theta)^{2}]$ does not lie between the lines $L_0$ and $L_1$, inspection is terminated. If the point $[m, \sum(x_\alpha-\theta)^{2}]$ lies on $L_0$ or below, the hypothesis that the product is satisfactory is accepted; and if the point $[m, \sum(x_\alpha-\theta)^{2}]$ lies on $L_1$ or above, the product is declared substandard.


8.4 The Operating Characteristic (OC) Function of the Test 

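For any value $\sigma$, let $L(\sigma)$ denote the probability that the test will terminate with the acceptance of the hypothesis that the product is satisfactory. The function $L(\sigma)$ is called the operating characteristic function of the test.

In Section 3.4 a general method is given for deriving an approximation formula for the OC function for any sequential probability ratio test. Applying the result of that section, we obtain

(8:13)  $L(\sigma) = \dfrac{\left(\dfrac{1-\beta}{\alpha}\right)^{h} - 1}{\left(\dfrac{1-\beta}{\alpha}\right)^{h} - \left(\dfrac{\beta}{1-\alpha}\right)^{h}}$

where $h$ is the root of the equation

(8:14)  $\dfrac{1}{\sqrt{2\pi}\,\sigma}\displaystyle\int_{-\infty}^{+\infty}\left[\dfrac{\sigma_0}{\sigma_1}\, e^{\frac{1}{2}\left(\frac{1}{\sigma_0^{2}}-\frac{1}{\sigma_1^{2}}\right)(x-\theta)^{2}}\right]^{h} e^{-\frac{1}{2\sigma^{2}}(x-\theta)^{2}}\,dx = 1$

It can be seen that the integral on the left side of (8:14) has a finite value only if $(h/\sigma_1^{2}) - (h/\sigma_0^{2}) + (1/\sigma^{2}) > 0$. In this case, as can be verified, we have

(8:15)  $\dfrac{1}{\sqrt{2\pi}\,\sigma}\displaystyle\int_{-\infty}^{+\infty} e^{-\frac{1}{2}\left(\frac{h}{\sigma_1^{2}}-\frac{h}{\sigma_0^{2}}+\frac{1}{\sigma^{2}}\right)(x-\theta)^{2}}\,dx = \dfrac{1}{\sigma\sqrt{\dfrac{h}{\sigma_1^{2}}-\dfrac{h}{\sigma_0^{2}}+\dfrac{1}{\sigma^{2}}}}$

Hence equation (8:14) can be written as

(8:16)  $\left(\dfrac{\sigma_0}{\sigma_1}\right)^{h} = \sigma\sqrt{\dfrac{h}{\sigma_1^{2}}-\dfrac{h}{\sigma_0^{2}}+\dfrac{1}{\sigma^{2}}}$

Instead of solving (8:16) with respect to $h$, we shall solve it with respect to $\sigma$. We obtain

(8:17)  $\sigma = \sqrt{\dfrac{\left(\dfrac{\sigma_0}{\sigma_1}\right)^{2h} - 1}{\dfrac{h}{\sigma_1^{2}} - \dfrac{h}{\sigma_0^{2}}}}$

With the use of equations (8:13) and (8:17), the OC curve can be plotted as follows. For any given value of $h$ we compute $\sigma$ and $L(\sigma)$ from equations (8:13) and (8:17). The pair $[\sigma, L(\sigma)]$ obtained in this way gives us a point on the OC curve. Computing $[\sigma, L(\sigma)]$ for a sufficiently large number of values of $h$, we obtain enough points to draw the OC curve.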
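The parametric construction just described might be sketched as follows. The function name and the numerical inputs ($\sigma_0 = 1$, $\sigma_1 = 2$, $\alpha = \beta = .05$) are assumptions for illustration; the value $h = 0$ is excluded because (8:13) and (8:17) take the form 0/0 there.

```python
import math

def sd_oc_point(h, sigma0, sigma1, alpha, beta):
    """Return (sigma, L(sigma)) for a nonzero parameter h, using (8:17) and (8:13)."""
    if h == 0.0:
        raise ValueError("h = 0 must be handled as a limiting case")
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    sigma_sq = ((sigma0 / sigma1) ** (2.0 * h) - 1.0) / (h / sigma1 ** 2 - h / sigma0 ** 2)
    a_h, b_h = math.exp(h * log_a), math.exp(h * log_b)
    return math.sqrt(sigma_sq), (a_h - 1.0) / (a_h - b_h)

# Sweep h over a grid of nonzero values to trace the OC curve.
curve = [sd_oc_point(h / 4.0, 1.0, 2.0, 0.05, 0.05) for h in range(-12, 13) if h != 0]
at_sigma0 = sd_oc_point(1.0, 1.0, 2.0, 0.05, 0.05)
at_sigma1 = sd_oc_point(-1.0, 1.0, 2.0, 0.05, 0.05)
print(at_sigma0, at_sigma1)
```

The values $h = 1$ and $h = -1$ give the points $[\sigma_0, 1-\alpha]$ and $[\sigma_1, \beta]$, which is a convenient check on any implementation.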

For computational purposes, it may be convenient to put²

(8:18)  $t = \dfrac{h}{2\sigma_1^{2}} - \dfrac{h}{2\sigma_0^{2}}$  or  $h = \dfrac{2t}{\dfrac{1}{\sigma_1^{2}} - \dfrac{1}{\sigma_0^{2}}}$

Then equations (8:13) and (8:17) can be written as

(8:19)  $L(\sigma) = \dfrac{e^{-t h_1} - 1}{e^{-t h_1} - e^{-t h_0}}$

and

(8:20)  $\sigma = \sqrt{\dfrac{e^{2ts} - 1}{2t}}$

where $s$ is the common slope and $h_0$ and $h_1$ are the intercepts of the lines $L_0$ and $L_1$. Equations (8:19) and (8:20) may be more convenient for the computation of the OC curve than the original equations (8:13) and (8:17).

For $\sigma = 0$, $\sigma_0$, $\sqrt{s}$, $\sigma_1$, $+\infty$ the values of $L(\sigma)$ are given as follows:

(8:21)  $L(0) = 1$;  $L(\sigma_0) = 1-\alpha$;  $L(\sqrt{s}) = \dfrac{h_1}{h_1 - h_0}$;  $L(\sigma_1) = \beta$;  $L(\infty) = 0$

These five points already determine roughly the shape of the OC curve and in many instances it will not be necessary to compute further points.

² A similar simplification was made by the Statistical Research Group. See SRG 255, p. 6.31. The parameter $t$ used there corresponds to $-t$ here.

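8.5 The Average Amount of Inspection Required by the Test

According to the results in Section 3.5, an approximation formula for the expected value $E_\sigma(n)$ of the number $n$ of observations required by the sampling plan is given by

(8:22)  $E_\sigma(n) = \dfrac{L(\sigma)\log\dfrac{\beta}{1-\alpha} + [1-L(\sigma)]\log\dfrac{1-\beta}{\alpha}}{E_\sigma(z)}$

where

(8:23)  $z = \log\dfrac{\dfrac{1}{\sigma_1}\, e^{-\frac{1}{2\sigma_1^{2}}(x-\theta)^{2}}}{\dfrac{1}{\sigma_0}\, e^{-\frac{1}{2\sigma_0^{2}}(x-\theta)^{2}}} = \log\dfrac{\sigma_0}{\sigma_1} + \dfrac{1}{2}\left(\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}\right)(x-\theta)^{2}$

and $E_\sigma(z)$ denotes the expected value of $z$ when $\sigma$ is the standard deviation of $x$. We have

(8:24)  $E_\sigma(z) = \dfrac{1}{2}\left(\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}\right)\sigma^{2} - \log\dfrac{\sigma_1}{\sigma_0}$

From (8:22) we obtain³

(8:25)  $E_\sigma(n) = \dfrac{L(\sigma)\left[\log\dfrac{\beta}{1-\alpha} - \log\dfrac{1-\beta}{\alpha}\right] + \log\dfrac{1-\beta}{\alpha}}{\dfrac{1}{2}\left(\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}\right)\sigma^{2} - \log\dfrac{\sigma_1}{\sigma_0}} = \dfrac{L(\sigma)(h_0 - h_1) + h_1}{\sigma^{2} - s}$

For $\sigma = \sqrt{s}$ the expected value of $z$ is equal to 0 and the right-hand member of (8:25) takes the form 0/0. According to equation (A:99) in the Appendix, the limiting value is given by

(8:26)  $E_{\sqrt{s}}(n) = \dfrac{-\log\dfrac{\beta}{1-\alpha}\,\log\dfrac{1-\beta}{\alpha}}{E_{\sqrt{s}}(z^{2})}$

Since $E_{\sqrt{s}}(z) = 0$, $E_{\sqrt{s}}(z^{2})$ is equal to the variance of $z$ when $\sigma = \sqrt{s}$. It follows easily from (8:23) that this variance is equal to $\frac{1}{2}\left(\frac{1}{\sigma_0^{2}} - \frac{1}{\sigma_1^{2}}\right)^{2} s^{2}$. Hence

(8:27)  $E_{\sqrt{s}}(n) = \dfrac{-\log\dfrac{\beta}{1-\alpha}\,\log\dfrac{1-\beta}{\alpha}}{\dfrac{1}{2}\left(\dfrac{1}{\sigma_0^{2}} - \dfrac{1}{\sigma_1^{2}}\right)^{2} s^{2}} = -\dfrac{h_0 h_1}{2s^{2}}$

³ The expression of $E_\sigma(n)$ in terms of the slope and intercepts of the decision lines is contained also in SRG 255, p. 6.34.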
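Since $L(\sigma)$ is obtained through the parameter $h$, the expected amount of inspection is most easily evaluated along the same parametrization. The sketch below returns the pair $[\sigma, E_\sigma(n)]$ for a given $h$, using the limiting value (8:27) at $h = 0$, where $\sigma = \sqrt{s}$. The function name and the numerical inputs are assumptions for illustration.

```python
import math

def sd_asn_point(h, sigma0, sigma1, alpha, beta):
    """Return (sigma, E_sigma(n)) for parameter h, using (8:13), (8:17), (8:25), (8:27)."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    d = 1.0 / sigma0 ** 2 - 1.0 / sigma1 ** 2
    s = math.log(sigma1 ** 2 / sigma0 ** 2) / d      # slope (8:10)
    h0, h1 = 2.0 * log_b / d, 2.0 * log_a / d        # intercepts (8:11), (8:12)
    if h == 0.0:
        return math.sqrt(s), -h0 * h1 / (2.0 * s ** 2)
    sigma_sq = ((sigma0 / sigma1) ** (2.0 * h) - 1.0) / (h / sigma1 ** 2 - h / sigma0 ** 2)
    a_h, b_h = math.exp(h * log_a), math.exp(h * log_b)
    oc = (a_h - 1.0) / (a_h - b_h)
    return math.sqrt(sigma_sq), (oc * (h0 - h1) + h1) / (sigma_sq - s)

# Hypothetical inputs: sigma0 = 1, sigma1 = 2, alpha = beta = .05.
asn_sigma0 = sd_asn_point(1.0, 1.0, 2.0, 0.05, 0.05)
asn_root_s = sd_asn_point(0.0, 1.0, 2.0, 0.05, 0.05)
asn_sigma1 = sd_asn_point(-1.0, 1.0, 2.0, 0.05, 0.05)
print(asn_sigma0, asn_root_s, asn_sigma1)
```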




8.6 Modification of the Test Procedure When the Population Mean Is Not Known *

If the mean $\theta$ of $x$ is not known, the following two modifications of the test procedure are to be made: (1) replace $\sum_{\alpha=1}^{m}(x_\alpha-\theta)^{2}$ by $\sum_{\alpha=1}^{m}(x_\alpha-\bar{x})^{2}$ where $\bar{x} = (x_1 + \cdots + x_m)/m$; (2) the acceptance number $a_m$ is replaced by $a_{m-1}$ and the rejection number $r_m$ is replaced by $r_{m-1}$. Thus, if the mean is unknown, the acceptance and rejection numbers at the $m$th trial are equal to the acceptance and rejection numbers corresponding to the $(m-1)$th trial when the mean is known.

The formula for the OC curve remains unchanged and the expected value of the number of observations required by the test is larger by 1 when the mean is unknown than when the mean is known.
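The two modifications can be sketched as below: deviations are taken from the running sample mean, and the limits for the $m$th trial are those of index $m-1$. The function name and the numerical inputs are assumptions for illustration; the sum of squares is recomputed from scratch at each stage for clarity rather than efficiency.

```python
import math

def run_sd_test_unknown_mean(observations, sigma0, sigma1, alpha, beta):
    """Sequential test of the standard deviation when the mean is not known."""
    d = 1.0 / sigma0 ** 2 - 1.0 / sigma1 ** 2
    s = math.log(sigma1 ** 2 / sigma0 ** 2) / d
    h0 = 2.0 * math.log(beta / (1.0 - alpha)) / d
    h1 = 2.0 * math.log((1.0 - beta) / alpha) / d
    for m in range(1, len(observations) + 1):
        sample = observations[:m]
        mean = sum(sample) / m
        q = sum((x - mean) ** 2 for x in sample)   # squared deviations from the sample mean
        if q <= h0 + (m - 1) * s:                  # a_{m-1}
            return m, "satisfactory"
        if q >= h1 + (m - 1) * s:                  # r_{m-1}
            return m, "substandard"
    return len(observations), "continue"

# Hypothetical inputs: constant readings, then readings alternating between 0 and 6.
constant = run_sd_test_unknown_mean([5.0] * 10, 1.0, 2.0, 0.05, 0.05)
alternating = run_sd_test_unknown_mean([0.0, 6.0] * 5, 1.0, 2.0, 0.05, 0.05)
print(constant, alternating)
```

With constant readings the test needs one more observation than the known-mean procedure would need for observations equal to the mean, in line with the remark about the expected number of observations.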

* The result contained in this section was found by C. Stein and M. A. Girshick, independently of each other. The proof is based on a transformation of the observations which reduces this case to the case when the mean is known. See Girshick's paper, "Contribution to the Theory of Sequential Analysis," The Annals of Mathematical Statistics, June, 1946.


Chapter 9. TESTING THAT THE MEAN OF A NORMAL DISTRIBUTION
WITH KNOWN VARIANCE IS EQUAL TO A SPECIFIED VALUE


9.1 Formulation of the Problem 

Let $x$ be a quality characteristic of a product, such as weight, diameter, or hardness. Suppose that $x$ is normally distributed in the population of all units produced and that the standard deviation $\sigma$ of $x$ is known but the mean $\theta$ of $x$ is unknown. Suppose, furthermore, that a particular value of $\theta$, say $\theta_0$, is considered the most desirable value for the product. In general, the greater the absolute deviation of the true value $\theta$ from the most desirable value $\theta_0$, the less satisfactory the product. Since the manufacturer would like to achieve and maintain the value $\theta_0$ of $\theta$ as closely as possible, he will be interested in testing the hypothesis that $\theta = \theta_0$. If the evidence supplied by a sample should indicate that $\theta \ne \theta_0$, he will try to improve the production process. Of course, if $\theta \ne \theta_0$ but is near $\theta_0$ there is no particular need to improve the production, and acceptance of the hypothesis that $\theta = \theta_0$ would not be a serious error. However, there will be, in general, a positive value $\delta$ such that the acceptance of the hypothesis that $\theta = \theta_0$ is regarded as an error of practical importance whenever $\left|\dfrac{\theta-\theta_0}{\sigma}\right| \ge \delta$.

The situation described in the preceding paragraph will thus lead to the following problem: A sampling plan is to be devised for which the probability that the hypothesis that $\theta = \theta_0$ will be rejected (the product will be declared substandard) does not exceed a small preassigned value $\alpha$ when $\theta = \theta_0$, and the probability of accepting the hypothesis that $\theta = \theta_0$ (declaring the product satisfactory) does not exceed a small preassigned value $\beta$ whenever $\left|\dfrac{\theta-\theta_0}{\sigma}\right| \ge \delta$.


9.2 A Sequential Sampling Plan Satisfying the Imposed Requirements

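It has been shown in Section 4.1.4 that an adequate sampling plan for the problem described in Section 9.1 is given as follows. Compute the ratio

(9:1)  $\dfrac{p_{1m}}{p_{0m}} = \dfrac{\dfrac{1}{2}\, e^{-\frac{1}{2\sigma^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0-\delta\sigma)^{2}} + \dfrac{1}{2}\, e^{-\frac{1}{2\sigma^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0+\delta\sigma)^{2}}}{e^{-\frac{1}{2\sigma^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0)^{2}}}$

at each stage of the experiment. Continue taking observations as long as

(9:2)  $B < \dfrac{p_{1m}}{p_{0m}} < A$

Accept the hypothesis that the product is satisfactory if

(9:3)  $\dfrac{p_{1m}}{p_{0m}} \le B$

Reject the hypothesis that the product is satisfactory if

(9:4)  $\dfrac{p_{1m}}{p_{0m}} \ge A$

To satisfy the requirements imposed regarding the probabilities of making wrong decisions, for all practical purposes we may put $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$.

The expression for $p_{1m}/p_{0m}$ given in (9:1) can be simplified to

(9:5)  $\dfrac{p_{1m}}{p_{0m}} = e^{-\frac{m}{2}\delta^{2}}\cosh\left[\dfrac{\delta}{\sigma}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0)\right]$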
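The step from (9:1) to (9:5) is not written out in the text; one way to see it, writing $u$ as a shorthand introduced here for $\frac{\delta}{\sigma}\sum_{\alpha}(x_\alpha-\theta_0)$, is to expand the two squares in the numerator:

```latex
\begin{align*}
-\frac{1}{2\sigma^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0\mp\delta\sigma)^{2}
  &= -\frac{1}{2\sigma^{2}}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0)^{2}
     \;\pm\; \frac{\delta}{\sigma}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0)
     \;-\; \frac{m\delta^{2}}{2}, \\[4pt]
\frac{p_{1m}}{p_{0m}}
  &= e^{-\frac{m}{2}\delta^{2}}\cdot\tfrac{1}{2}\left(e^{u}+e^{-u}\right)
   = e^{-\frac{m}{2}\delta^{2}}\cosh u,
  \qquad u = \frac{\delta}{\sigma}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0).
\end{align*}
```

The common factor $e^{-\frac{1}{2\sigma^{2}}\sum(x_\alpha-\theta_0)^{2}}$ cancels against the denominator of (9:1), which leaves only the terms shown.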




Substituting this value of $p_{1m}/p_{0m}$ in (9:2), (9:3), and (9:4) and taking logarithms, we find that these inequalities become

(9:6)  $\log B + m\,\dfrac{\delta^{2}}{2} < \log\cosh\left[\dfrac{\delta}{\sigma}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0)\right] < \log A + m\,\dfrac{\delta^{2}}{2}$

(9:7)  $\log\cosh\left[\dfrac{\delta}{\sigma}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0)\right] \le \log B + m\,\dfrac{\delta^{2}}{2}$

and

(9:8)  $\log\cosh\left[\dfrac{\delta}{\sigma}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0)\right] \ge \log A + m\,\dfrac{\delta^{2}}{2}$

With the use of inequalities (9:6), (9:7), and (9:8), the test procedure is carried out as follows. At each stage of the experiment we compute $Z_m = \log\cosh\left[\dfrac{\delta}{\sigma}\sum_{\alpha=1}^{m}(x_\alpha-\theta_0)\right]$. The first time that $Z_m$ does not lie between $\log B + [m(\delta^{2}/2)]$ and $\log A + [m(\delta^{2}/2)]$ we terminate the process. The hypothesis that $\theta = \theta_0$ is accepted if $Z_m \le \log B + [m(\delta^{2}/2)]$, and rejected if $Z_m \ge \log A + [m(\delta^{2}/2)]$.

The computation of $Z_m$ at each stage of the experiment is somewhat cumbersome. However, if $\left|\dfrac{\delta}{\sigma}\sum(x_\alpha-\theta_0)\right|$ is greater than 3, $Z_m = \log\cosh\left[\dfrac{\delta}{\sigma}\sum(x_\alpha-\theta_0)\right]$ is very nearly equal to $\left|\dfrac{\delta}{\sigma}\sum(x_\alpha-\theta_0)\right| - \log 2$.¹ When this approximation to $Z_m$ is used, inequalities (9:6), (9:7), and (9:8) simplify to

(9:9)  $\dfrac{\sigma}{\delta}(\log B + \log 2) + m\,\dfrac{\sigma\delta}{2} < \left|\sum(x_\alpha-\theta_0)\right| < \dfrac{\sigma}{\delta}(\log A + \log 2) + m\,\dfrac{\sigma\delta}{2}$

(9:10)  $\left|\sum(x_\alpha-\theta_0)\right| \le \dfrac{\sigma}{\delta}(\log B + \log 2) + m\,\dfrac{\sigma\delta}{2}$

and

(9:11)  $\left|\sum(x_\alpha-\theta_0)\right| \ge \dfrac{\sigma}{\delta}(\log A + \log 2) + m\,\dfrac{\sigma\delta}{2}$

respectively. For all practical purposes inequalities (9:9), (9:10), and (9:11) may be used instead of (9:6), (9:7), and (9:8) whenever $\dfrac{\delta}{\sigma}\left|\sum(x_\alpha-\theta_0)\right| \ge 3$.

The following is an alternative computational procedure which may be found useful. Consider the equation in $u$,

(9:12)  $\log\cosh u = v$

This has exactly one positive solution if $v > 0$. The root of this equation is given by

(9:13)  $|u| = \phi(v) = \log\left(e^{v} + \sqrt{e^{2v} - 1}\right)$

¹ See also SRG 255, p. B. 15.
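Both the function $\phi(v)$ of (9:13) and the quality of the approximation $\log\cosh u \approx |u| - \log 2$ can be checked numerically. In the sketch below the function names are illustrative, and the rearranged form of $\log\cosh u$ is a standard identity used here only to avoid overflow for large $|u|$.

```python
import math

def phi(v):
    """Non-negative root u of log cosh u = v, as in (9:13); requires v >= 0."""
    if v < 0.0:
        raise ValueError("log cosh u = v has no real root for v < 0")
    return math.log(math.exp(v) + math.sqrt(math.exp(2.0 * v) - 1.0))

def log_cosh(u):
    """log cosh u, written as |u| + log(1 + exp(-2|u|)) - log 2."""
    u = abs(u)
    return u + math.log1p(math.exp(-2.0 * u)) - math.log(2.0)

# Error of the approximation |u| - log 2 at the threshold |u| = 3.
approx_error_at_3 = log_cosh(3.0) - (3.0 - math.log(2.0))
# Round trip: phi inverts log cosh on the non-negative axis.
round_trip = log_cosh(phi(1.5))
print(approx_error_at_3, round_trip)
```

At $|u| = 3$ the error of the approximation is $\log(1 + e^{-6})$, which is below .0025 and decreases rapidly as $|u|$ grows.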





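The function $\phi(v)$ can easily be tabulated. In terms of the function $\phi(v)$, inequalities (9:6), (9:7), and (9:8) can be written as

(9:14)  $\dfrac{\sigma}{\delta}\,\phi\!\left(\log B + m\,\dfrac{\delta^{2}}{2}\right) < \left|\sum(x_\alpha-\theta_0)\right| < \dfrac{\sigma}{\delta}\,\phi\!\left(\log A + m\,\dfrac{\delta^{2}}{2}\right)$

(9:15)  $\left|\sum(x_\alpha-\theta_0)\right| \le \dfrac{\sigma}{\delta}\,\phi\!\left(\log B + m\,\dfrac{\delta^{2}}{2}\right)$

and

(9:16)  $\left|\sum(x_\alpha-\theta_0)\right| \ge \dfrac{\sigma}{\delta}\,\phi\!\left(\log A + m\,\dfrac{\delta^{2}}{2}\right)$

When inequalities (9:14), (9:15), and (9:16) are used, the test can be carried out as follows. For each integral value $m$ we compute the acceptance number

(9:17)  $a_m = \dfrac{\sigma}{\delta}\,\phi\!\left(\log B + m\,\dfrac{\delta^{2}}{2}\right)$

and the rejection number

(9:18)  $r_m = \dfrac{\sigma}{\delta}\,\phi\!\left(\log A + m\,\dfrac{\delta^{2}}{2}\right)$

These acceptance and rejection numbers can be computed before experimentation starts. Additional observations are taken as long as $a_m < \left|\sum(x_\alpha-\theta_0)\right| < r_m$. If $\left|\sum(x_\alpha-\theta_0)\right| \le a_m$ the hypothesis that $\theta = \theta_0$ is accepted and if $\left|\sum(x_\alpha-\theta_0)\right| \ge r_m$ the hypothesis that $\theta = \theta_0$ is rejected.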
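A sketch of the procedure based on (9:17) and (9:18) follows. One point is an interpretation rather than something stated here: while $\log B + m\delta^{2}/2$ is negative, $\phi$ is undefined, and since $\log\cosh$ is never negative, inequality (9:7) cannot hold, so the sketch treats acceptance as impossible at such $m$. The function names and the numerical inputs are assumptions for illustration.

```python
import math

def phi(v):
    """Non-negative root of log cosh u = v, as in (9:13); requires v >= 0."""
    return math.log(math.exp(v) + math.sqrt(math.exp(2.0 * v) - 1.0))

def two_sided_limits(m, sigma, delta, alpha, beta):
    """Acceptance and rejection numbers (9:17), (9:18); a_m is None while undefined."""
    v_accept = math.log(beta / (1.0 - alpha)) + m * delta ** 2 / 2.0
    v_reject = math.log((1.0 - beta) / alpha) + m * delta ** 2 / 2.0
    a_m = sigma / delta * phi(v_accept) if v_accept >= 0.0 else None
    r_m = sigma / delta * phi(v_reject)
    return a_m, r_m

def run_two_sided_test(observations, theta0, sigma, delta, alpha, beta):
    """Sequential test of theta = theta0 on the cumulated deviations from theta0."""
    total = 0.0
    for m, x in enumerate(observations, start=1):
        total += x - theta0
        a_m, r_m = two_sided_limits(m, sigma, delta, alpha, beta)
        if a_m is not None and abs(total) <= a_m:
            return m, "accept"
        if abs(total) >= r_m:
            return m, "reject"
    return len(observations), "continue"

# Hypothetical inputs: theta0 = 0, sigma = 1, delta = 1, alpha = beta = .05.
on_target = run_two_sided_test([0.0] * 10, 0.0, 1.0, 1.0, 0.05, 0.05)
off_target = run_two_sided_test([3.0] * 10, 0.0, 1.0, 1.0, 0.05, 0.05)
print(on_target, off_target)
```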


PART III. THE PROBLEM OF MULTI-VALUED DECISIONS AND ESTIMATION


Chapter 10. THE CHOICE OF A HYPOTHESIS FROM A SET
OF MUTUALLY EXCLUSIVE HYPOTHESES (MULTI-VALUED DECISION)

10.1 Formulation of the Problem 

Part I has been devoted exclusively to the discussion of the problem of testing a statistical hypothesis. In such problems only one of two possible decisions can be made: the hypothesis is either rejected or accepted. Thus, we can say that testing a hypothesis is a two-valued decision problem, since the decision can take only the two values: acceptance and rejection. Let $\bar{H}$ denote the negation of the hypothesis $H$ to be tested. Then testing the hypothesis $H$ is the same as choosing between the two competing hypotheses $H$ and $\bar{H}$.

It has been pointed out in Section 1.3.5 that testing a hypothesis $H$ arises frequently as a consequence of the problem of deciding between two alternative courses of action, say action 1 and action 2. Suppose that the preference for one or the other action depends on the value of an unknown parameter $\theta$ of the distribution of a random variable $x$. Let $\omega$ denote the set of all values of $\theta$ for which action 1 is preferred to action 2 (or at least not less desirable than action 2). If a decision is to be made on the basis of a finite number of observations on $x$, this leads to the problem of testing the hypothesis $H$ that the true value $\theta$ lies in $\omega$. If $H$ is accepted, we decide for action 1, and if $H$ is rejected we decide for action 2. In applications it happens frequently that there are more than two alternative courses of action, one of which is to be chosen. Suppose that there are $k$ ($k > 2$) alternative actions, say action 1, action 2, $\ldots$, action $k$, and that one of them is to be chosen on the basis of some observations on the random variable $x$. Suppose, furthermore, that the relative degree of preference for these actions depends on the value of a parameter $\theta$ of the distribution of $x$. Then it will be possible, in general, to subdivide the totality of all possible values of $\theta$ into $k$ mutually exclusive parts $\omega_1, \omega_2, \ldots, \omega_k$, such that action $j$ is preferable to all other actions $i \ne j$ if, and only if, the true value $\theta$ lies in $\omega_j$. Let $H_j$ denote the hypothesis that $\theta$ lies in $\omega_j$ ($j = 1, \ldots, k$). Then the problem of deciding for a particular action reduces to the problem of choosing one of the hypotheses $H_1, \ldots, H_k$. If $H_i$ is accepted we decide to take action $i$. Such a problem may be called a multi-valued decision problem, since the decision to be made can take $k$ values: We may accept $H_1$, or $H_2$, $\ldots$, or $H_k$.

In this section we shall deal with the problem of choosing one out of $k$ mutually exclusive and exhaustive hypotheses, $H_1, \ldots, H_k$, on the basis of some observations on the random variable $x$ under consideration.¹ The problem of testing a hypothesis is contained in this as a special case when $k = 2$.

The following simple example may serve as an illustration. Suppose that $x$ is a measurable quality characteristic of a product which is normally distributed in the population of units produced. Suppose, furthermore, that the quality of the product is regarded the better the higher the mean value $\theta$ of $x$. Assume that the following three alternative actions are under consideration by the manufacturer: (1) to sell the product at the regular market price, (2) to label the product as second rate quality and sell it at a reduced price, (3) to withhold the product from the market. Let $a$ and $b$ ($a < b$) be two values of $\theta$ such that the manufacturer prefers action 3 if $\theta \le a$, he prefers action 2 if $a < \theta < b$, and he prefers action 1 if $\theta \ge b$. Let $H_1$ denote the hypothesis that $\theta \le a$, $H_2$ the hypothesis that $a < \theta < b$, and $H_3$ the hypothesis that $\theta \ge b$. If the value of $\theta$ is unknown and if the manufacturer must decide which action should be taken on the basis of some observations on $x$, he is faced with the multi-valued decision problem of choosing one of the mutually exclusive hypotheses $H_1$, $H_2$, and $H_3$.

¹ This problem in the non-sequential case, that is, when the total number of observations to be made is determined in advance, has been treated in several previous publications. See, for example, the author's article "Statistical Decision Functions Which Minimize the Maximum Risk," The Annals of Mathematics, April, 1945.

10.2 The General Nature of a Sequential Sampling Plan for Selecting a Hypothesis from a Set of Mutually Exclusive Hypotheses

A sequential sampling plan for choosing one of $k$ mutually exclusive and exhaustive hypotheses $H_1, \ldots, H_k$ may be described as follows. A rule is given for making one of the following $(k+1)$ decisions at each stage of the experiment (at the $m$th trial for each integral value of $m$): (1) to terminate experimentation with the acceptance of $H_1$; (2) to terminate experimentation with the acceptance of $H_2$; $\ldots$; ($k$) to terminate experimentation with the acceptance of $H_k$; ($k+1$) to continue the experiment by making an additional observation. Such a procedure is carried out sequentially. On the basis of the first observation one of the aforementioned $(k+1)$ decisions is made. If one of the first $k$ decisions is made, the process is terminated. If the last decision is made, a second trial is performed. Again, on the basis of the first two observations, one of the $(k+1)$ decisions is made. If the last decision is made, a third trial is performed, and so on. The process is continued until one of the first $k$ decisions is made.
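The general scheme can be written as a loop over observations in which a decision rule is consulted after each trial. In the sketch below, `run_sequential_plan` follows the description above, while `toy_three_action_rule`, its cut points, and its stopping margin are purely hypothetical and are not a rule proposed in the text; they serve only to make the loop executable for the three-hypothesis example of Section 10.1.

```python
import math

CONTINUE = None  # decision (k + 1): take an additional observation

def run_sequential_plan(observations, rule):
    """Apply `rule` to the growing sample; stop at the first terminal decision."""
    sample = []
    for x in observations:
        sample.append(x)
        decision = rule(sample)          # index of accepted hypothesis, or CONTINUE
        if decision is not CONTINUE:
            return len(sample), decision
    return len(sample), CONTINUE

def toy_three_action_rule(sample, a=4.0, b=8.0, margin=3.0):
    """Hypothetical rule: stop once the sample mean is clear of both cut points a and b."""
    m = len(sample)
    mean = sum(sample) / m
    if min(abs(mean - a), abs(mean - b)) < margin / math.sqrt(m):
        return CONTINUE
    if mean <= a:
        return 1      # accept H_1: theta <= a
    if mean >= b:
        return 3      # accept H_3: theta >= b
    return 2          # accept H_2: a < theta < b

high = run_sequential_plan([10.0] * 6, toy_three_action_rule)
middle = run_sequential_plan([6.0] * 6, toy_three_action_rule)
print(high, middle)
```

Any rule of this form partitions each $m$-dimensional sample space into $k$ terminal regions and one continuation region.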

In more precise mathematical terms, a sequential sampling plan may be described as follows. Let $R_m$ denote the totality of all possible samples of size $m$, i.e., $R_m$ is the $m$-dimensional sample space. For each positive integral value of $m$, the $m$-dimensional sample space is split into $(k+1)$ mutually exclusive parts, $R_{m1}, R_{m2}, \ldots, R_{mk}$, and $R_{m,k+1}$. If the first observation $x_1$ lies in $R_{1i}$ where $i \le k$, the process is terminated with the acceptance of $H_i$. If $x_1$ lies in $R_{1,k+1}$ a second observation $x_2$ is made. Again, if $(x_1, x_2)$ lies in some $R_{2i}$ with $i \le k$, the process is terminated with the acceptance of $H_i$. If $(x_1, x_2)$ lies in $R_{2,k+1}$ a third trial is performed, and so on. This process is stopped at the first time when the sample $(x_1, \ldots, x_m)$ lies in $R_{mi}$ for some value $i \le k$. Thus, a sequential sampling plan is completely defined by the sets $R_{m1}, \ldots, R_{m,k+1}$. Since these sets are mutually exclusive and add up to the whole sample space $R_m$, it is sufficient to define any $k$ of these sets, since they determine uniquely the remaining set.

For any $m$, the subdivision of the sample space $R_m$ into the $(k+1)$ parts $R_{m1}, \ldots, R_{m,k+1}$ can be made in many ways, and a fundamental problem is that of a proper choice of these sets. In order to set up principles for this choice, in the next section we shall study the consequences of any particular choice.

10.3 Consequences of the Choice of Any Particular Sequential Sampling Plan

After a particular choice of the sets $R_{m1}, \ldots, R_{m,k+1}$ has been made, i.e., a particular sequential sampling plan has been adopted, for any $i \le k$ the probability that the process will terminate with the acceptance of $H_i$ depends only on the distribution of the random variable $x$ under consideration. Since it is assumed that the distribution of $x$ is known except for the values of a finite number of parameters $\theta_1, \ldots, \theta_r$, the probability that $H_i$ will be accepted will be a function of these parameters. To simplify notation, we shall use the letter $\theta$ without subscript to denote the set of all $r$ parameters $\theta_1, \ldots, \theta_r$. Let $L_i(\theta)$ denote the probability that the adopted sequential sampling plan will terminate with the acceptance of $H_i$ ($i = 1, \ldots, k$). We shall refer to the set of functions $L_1(\theta), L_2(\theta), \ldots, L_k(\theta)$ as the operating characteristics of the sampling plan. We shall consider only sampling plans for which the probability is 1 that the process will eventually terminate. Then we have

(10:1)  $L_1(\theta) + \cdots + L_k(\theta) = 1$

and, therefore, one of the functions $L_1(\theta), \ldots, L_k(\theta)$ is determined by the other $k - 1$.

The operating characteristics represent the accomplishment of the sampling plan in giving protection against possible wrong decisions. For any parameter point $\theta$, the probability of accepting the correct hypothesis, i.e., the hypothesis which is consistent with parameter point $\theta$, can be obtained immediately from the operating characteristics. Since the hypotheses $H_1, \ldots, H_k$ are mutually exclusive and exhaustive, for any given parameter point $\theta$ one, and only one, of the hypotheses $H_1, \ldots, H_k$ will be consistent with a given $\theta$. If $H_i$ is the hypothesis consistent with a given $\theta$, the probability of making a correct decision when this $\theta$ is true is equal to $L_i(\theta)$. The operating characteristics of a sampling plan are considered the more favorable the higher the probability for making correct decisions for the various possible parameter points $\theta$.

The price we have to pay for the accomplishment of the sampling plan in giving protection against wrong decisions is represented by the number $n$ of observations required by the sampling plan. Since $n$ is a random variable, we shall consider, as in testing a hypothesis, the expected value of $n$. After a particular sampling plan has been adopted, the expected value of $n$ will be a function of the parameter point $\theta$ only. As in testing hypotheses, we shall denote the expected value of $n$, when $\theta$ is true, by $E_\theta(n)$, and we shall refer to $E_\theta(n)$ as the average sample number (ASN) function of the sampling plan.

In conclusion we may say that the most important consequences of any particular choice of a sampling plan are given by the operating characteristics and the ASN function of the adopted sampling plan. The operating characteristics represent the accomplishments of the sampling plan and the ASN function represents the price paid for these accomplishments.




MULTI-VALUED DECISIONS 


10.4 Principles for the Selection of a Sequential Sampling Plan 

10.4.1 Dependence of Importance of Possible Wrong Decisions on
the Parameter Point $\theta$

To set up principles for the selection of a sequential sampling plan 
it will be necessary to investigate the dependence of the importance 
of possible wrong decisions on the parameter point. Let $\omega_i$ denote the
set of parameter points $\theta$ consistent with $H_i$ $(i = 1, \cdots, k)$, i.e., $H_i$ is
precisely the statement that the true parameter point $\theta$ is included in
$\omega_i$. If the true $\theta$ is in $\omega_i$ but not far from $\omega_j$ for some $j \neq i$, the
acceptance of $H_j$ will not be regarded, in general, as a serious error.
However, if $\theta$ is far from $\omega_j$ and $H_j$ is accepted, the error committed will
usually be of considerable practical consequence.

As an illustration, consider again the example given in Section 10.1. 
The decision to withhold the product from the market will be considered
an error of little practical significance if $\theta$ is only slightly above
$a$. The seriousness of this error will, however, increase with increasing
value of $\theta$. If $\theta$ is substantially above $a$, the decision to withhold the
product will be regarded as an error of considerable practical importance.
Similarly, the decision to try to sell the product at regular
market price will not be a serious error if $\theta$ is just slightly below $b$,
but the importance of this error will increase with decreasing value
of $\theta$.

It will frequently be possible to express the importance of the various
possible wrong decisions by $k$ functions $w_1(\theta), \cdots, w_k(\theta)$, where
$w_j(\theta)$ is a non-negative function expressing the importance of the error
committed by accepting $H_j$ when $\theta$ is true. In industrial problems,
$w_j(\theta)$ may be thought of as expressing the financial loss caused by
taking the action corresponding to the acceptance of $H_j$ when $\theta$ is true.
We shall, of course, put $w_j(\theta) = 0$ for all points $\theta$ in $\omega_j$, since for such
points $\theta$ the acceptance of $H_j$ is a correct decision. We shall refer to
the functions $w_1(\theta), \cdots, w_k(\theta)$ as error weight functions, or more briefly
as weight functions.

The choice of a sampling plan will be influenced by the weight functions
$w_1(\theta), \cdots, w_k(\theta)$. The determination of these weight functions
cannot be regarded as a statistical problem. They will be chosen on 
the basis of practical considerations in each particular problem. 

10.4.2 The Risk Function Associated with a Given Sampling Plan 

For any parameter point $\theta$ we shall mean by the risk $r(\theta)$ the
expected value of the loss caused by possible wrong decisions when $\theta$ is
true. Since the probability of accepting $H_i$ is equal to $L_i(\theta)$ and since
the loss caused by this decision is given by $w_i(\theta)$, the expected value
of the loss is equal to

(10:2) $r(\theta) = L_1(\theta) w_1(\theta) + L_2(\theta) w_2(\theta) + \cdots + L_k(\theta) w_k(\theta)$

We shall refer to $r(\theta)$ as the risk function of the sampling plan.*

We shall judge the relative merits of a sampling plan by its risk
function $r(\theta)$ and ASN function $E_\theta(n)$.
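
As a minimal sketch of (10:2), the risk at a parameter point may be computed from the operating characteristics and the weight functions. The function name `risk` and the numerical values below are hypothetical and serve only as an illustration.

```python
from typing import Callable, Sequence


def risk(theta: float,
         oc: Sequence[Callable[[float], float]],
         weights: Sequence[Callable[[float], float]]) -> float:
    """Risk r(theta) = sum_i L_i(theta) * w_i(theta), as in (10:2)."""
    if len(oc) != len(weights):
        raise ValueError("need one weight function per hypothesis")
    return sum(L(theta) * w(theta) for L, w in zip(oc, weights))


# Hypothetical illustration with k = 3 at one parameter point:
# acceptance probabilities 0.90, 0.08, 0.02 and losses 0, 1, 5.
oc = [lambda t: 0.90, lambda t: 0.08, lambda t: 0.02]
weights = [lambda t: 0.0, lambda t: 1.0, lambda t: 5.0]
r = risk(0.0, oc, weights)  # 0.90*0 + 0.08*1 + 0.02*5
```

The acceptance probabilities are made to sum to 1, in agreement with (10:1).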

10.4.3 The Risk Function and the ASN Function as a Basis for the 
Selection of a Sequential Sampling Plan 
A sequential sampling plan is the better the smaller the risk $r(\theta)$ and
the smaller the expected value $E_\theta(n)$ of the number of observations.
These two desiderata of a sampling plan are somewhat in conflict, since
the smaller we make $r(\theta)$, the larger, in general, will be the number of
observations required by the plan. To achieve a reasonable compromise
between these two conflicting desiderata, one may proceed as
follows. First we impose the condition that the risk $r(\theta)$ shall not
exceed a certain prescribed positive value $r_0$, i.e.,

(10:3) $r(\theta) \leq r_0$

for all parameter points $\theta$. We then consider only sampling plans for
which (10:3) is fulfilled. From this class of sampling plans we try to
select one for which $E_\theta(n)$ is as small as possible.

To impose first the condition (10:3) and then to try to minimize the
expected number of observations does not seem to be
an unreasonable procedure, since the risk function $r(\theta)$ is perhaps of
primary importance.†

The choice of the upper limit $r_0$ of the risk is not a statistical problem.
It will be determined on the basis of practical considerations in
each particular case.
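
The two-step principle may be sketched as follows for a finite list of candidate plans examined on a grid of parameter points. The dictionary layout, the plan names, and the numbers are hypothetical. Comparing plans by their largest ASN over the grid is an assumption made for this sketch, since the text asks only that $E_\theta(n)$ be as small as possible.

```python
def select_plan(plans, thetas, r0):
    """Among plans meeting (10:3) on the grid, return the one with the smallest worst-case ASN.

    Each plan is a dict with callables 'risk' and 'asn' (hypothetical layout).
    Returns None when no plan satisfies the risk condition.
    """
    admissible = [p for p in plans
                  if all(p["risk"](t) <= r0 for t in thetas)]  # condition (10:3)
    if not admissible:
        return None
    # Assumption: rank admissible plans by the maximum of E_theta(n) over the grid.
    return min(admissible, key=lambda p: max(p["asn"](t) for t in thetas))


plans = [
    {"name": "P1", "risk": lambda t: 0.04, "asn": lambda t: 60.0},
    {"name": "P2", "risk": lambda t: 0.05, "asn": lambda t: 45.0},
    {"name": "P3", "risk": lambda t: 0.09, "asn": lambda t: 20.0},  # violates (10:3)
]
best = select_plan(plans, thetas=[-1.0, 0.0, 1.0], r0=0.05)
```

Here the plan with the smallest ASN is excluded because its risk exceeds $r_0$, which illustrates why the risk condition is imposed first.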

* Another possible definition of the risk function could be given by including also
the expected value of the cost of experimentation. If $c$ denotes the cost of taking
a single observation, the expected value of the cost of experimentation is equal to
$c E_\theta(n)$ and the risk is given by

(10:2*) $r^*(\theta) = \sum_{i=1}^{k} L_i(\theta) w_i(\theta) + c E_\theta(n)$

If the cost of experimentation is not proportional to the number of observations,
but is given by the cost function $c(n)$, then the term $c E_\theta(n)$ in (10:2*) is to be
replaced by $E_\theta[c(n)]$.

† Using the risk function as given in (10:2*), a sampling plan for which the
maximum value of $r^*(\theta)$ with respect to $\theta$ is minimized may be regarded as an
optimum plan. If this definition of an optimum sampling plan is accepted, no
condition of the type (10:3) is imposed; we simply try to find a plan for which the
maximum of $r^*(\theta)$ with respect to $\theta$ takes the smallest possible value.




10.4.4 The Use of Certain Simple Weight Functions 

The construction of specific weight functions $w_1(\theta), \cdots, w_k(\theta)$ in a
given problem may occasionally run into practical difficulties. Although
in industrial problems $w_j(\theta)$ could be assumed to be equal to
the financial loss (or estimated financial loss) caused by the acceptance
of $H_j$ when $\theta$ is true, in purely scientific investigations it is rather difficult
to give a reasonable measure of the loss caused by accepting a
wrong hypothesis.

Even if the difficulties in measuring the loss caused by possible wrong 
decisions are disregarded, we still face the practical difficulty that the 
weight functions $w_1(\theta), \cdots, w_k(\theta)$ in a given problem may be too involved
to be manageable. Thus, there is a need for simplification.

The choice of the sampling plan is usually not very dependent on 
the exact shape of the weight functions. It will, therefore, be fre- 
quently satisfactory to use some rough approximations, reproducing 
only the main features of the weight functions. A very rough, but
for many applications satisfactory, approximation can be obtained by
replacing $w_j(\theta)$ by $\bar{w}_j(\theta)$ defined as follows:

(10:4) $\bar{w}_j(\theta) = 0$ if $w_j(\theta)$ is less than or equal to a certain value $c_j$
       $\bar{w}_j(\theta) = c$ if $w_j(\theta) > c_j$

where $c$ is some positive constant. Thus, $\bar{w}_j(\theta)$ can take only two
values, 0 and $c$. There is no loss of generality in putting $c = 1$, since
this can be achieved by multiplication by a proportionality factor
which has no effect on the selection of the sampling plan.

In what follows in this and the following section, we shall consider
only the weight functions $\bar{w}_j(\theta)$. We shall call the set of all parameter
points $\theta$ for which $\bar{w}_i(\theta) = 0$ and $\bar{w}_j(\theta) = 1$ for $j \neq i$ the zone of preference
for acceptance of $H_i$. The set of points $\theta$ for which $\bar{w}_i(\theta) =
\bar{w}_j(\theta) = 0$ and $\bar{w}_k(\theta) = 1$ for $k \neq i, j$ will be called the zone of indifference
between $H_i$ and $H_j$. Similarly, the set of points $\theta$ for which
$\bar{w}_i(\theta) = \bar{w}_j(\theta) = \bar{w}_m(\theta) = 0$ and $\bar{w}_l(\theta) = 1$ for $l \neq i, j, m$ will be called
the zone of indifference among the hypotheses $H_i$, $H_j$, and $H_m$, and
so on.

If we deal with the problem of testing a hypothesis $H$, then $k = 2$,
$H_1 = H$, and $H_2$ is equal to the negation $\bar{H}$ of $H$. The zone of preference
for acceptance of $H$, the zone of preference for acceptance of $\bar{H}$,
and the zone of indifference between $H$ and $\bar{H}$ defined here correspond
to the zone of preference for acceptance, zone of preference for rejection,
and zone of indifference discussed in Section 2.3.1.

To illustrate the meaning of the various zones defined here, we consider
again the example discussed in Section 10.1. In this example $H_1$
is the hypothesis that $\theta \leq a$, $H_2$ is the hypothesis that $a < \theta < b$, and
$H_3$ is the hypothesis that $\theta \geq b$. The functions $\bar{w}_1(\theta)$, $\bar{w}_2(\theta)$, and $\bar{w}_3(\theta)$
may reasonably be defined as follows:

$\bar{w}_1(\theta) = 0$ for $\theta < a + \Delta$
$\bar{w}_1(\theta) = 1$ for $\theta \geq a + \Delta$, where $\Delta$ is a certain positive quantity
$\bar{w}_2(\theta) = 0$ if $a - \Delta < \theta < b + \Delta$, and $= 1$ elsewhere
$\bar{w}_3(\theta) = 0$ if $\theta \geq b - \Delta$, and $= 1$ elsewhere

Then the zone of preference for acceptance of $H_1$ is the set of values
of $\theta$ for which $\theta \leq a - \Delta$. The zone of preference for acceptance of
$H_2$ is given by the inequality $a + \Delta \leq \theta < b - \Delta$, and the zone of
preference for acceptance of $H_3$ by $\theta \geq b + \Delta$. The zone of indifference
between $H_1$ and $H_2$ is given by the inequality $a - \Delta < \theta < a + \Delta$,
the zone of indifference between $H_1$ and $H_3$ is empty, and the zone
of indifference between $H_2$ and $H_3$ is given by $b - \Delta \leq \theta < b + \Delta$.
Finally, the zone of indifference among $H_1$, $H_2$, and $H_3$ is empty.
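
The zones of this example can be checked mechanically. The following sketch encodes the three 0/1 weight functions and names the zone to which a given parameter value belongs. The function names and the values $a = 0$, $b = 2$, $\Delta = 0.5$ are hypothetical, and it is assumed that $a + \Delta \leq b - \Delta$, so that at least one weight vanishes at every point.

```python
def simple_weights(theta, a, b, delta):
    """Return (w1, w2, w3), the 0/1 weights of the three-hypothesis example."""
    w1 = 0 if theta < a + delta else 1
    w2 = 0 if a - delta < theta < b + delta else 1
    w3 = 0 if theta >= b - delta else 1
    return (w1, w2, w3)


def zone(theta, a, b, delta):
    """Name the zone of theta from the hypotheses whose weight is zero there."""
    w = simple_weights(theta, a, b, delta)
    acceptable = [f"H{i + 1}" for i, wi in enumerate(w) if wi == 0]
    if len(acceptable) == 1:
        return f"preference for {acceptable[0]}"
    return "indifference between " + " and ".join(acceptable)


# Hypothetical values a = 0, b = 2, delta = 0.5.
labels = [zone(t, 0.0, 2.0, 0.5) for t in (-1.0, 0.0, 1.0, 1.5, 3.0)]
```

The five test points fall, in order, into the zone of preference for the first hypothesis, the first zone of indifference, the zone of preference for the second hypothesis, the second zone of indifference, and the zone of preference for the third hypothesis.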

When the weight functions $\bar{w}_1(\theta), \cdots, \bar{w}_k(\theta)$ are used, the risk function
$r(\theta)$ defined in (10:2) takes a particularly simple form. Since
$\bar{w}_j(\theta)$ can take only the values 0 and 1, we shall have

(10:5) $r(\theta) = \sum_j L_j(\theta)$

where the summation is to be taken for all values of $j$ for which
$\bar{w}_j(\theta) = 1$.

We shall say that a wrong decision is made if, and only if, a hypothesis
$H_i$ is accepted for which $\bar{w}_i(\theta) = 1$. Then the risk $r(\theta)$ given in
(10:5) is simply equal to the probability that a wrong decision will be
made.

The principle for the selection of a sequential sampling plan, as 
stated in Section 10.4.3, can now be formulated as follows. We consider
only sequential sampling plans for which the probability of making
a wrong decision does not exceed a certain preassigned value $r_0$.
From the class of such sequential sampling plans we try to select one
for which the expected value of the number of observations required 
by the plan is as small as possible. 


10.5 Discussion of a Special Class of Sequential Sampling Plans

The problem of finding a sequential sampling plan which may be
regarded as an optimum plan in the sense of the previous section is
not yet solved. However, as will be shown in this section, a wide class
of sequential sampling plans can be constructed for which the condition
that the probability of making a wrong decision should not exceed
a preassigned value $r_0$ is fulfilled.

To construct such a class of sampling plans we shall make use of
the following lemma.

Lemma. Let $x_1, x_2, \cdots$, etc., be a sequence of variates, let $p_{1m}(x_1, \cdots,
x_m)$ $(m = 1, 2, \cdots)$ denote the joint probability density function of $x_1,
\cdots, x_m$ under the hypothesis $H_1$, and let $p_{0m}(x_1, \cdots, x_m)$ be the density
function under the hypothesis $H_0$.* Let, furthermore, $A$ be a constant
greater than one. Then, under the hypothesis $H_0$, the probability
that

(10:6) $\frac{p_{1m}(x_1, \cdots, x_m)}{p_{0m}(x_1, \cdots, x_m)} < A$

will hold for all values of $m$ is greater than or equal to $1 - (1/A)$.

The validity of this lemma can easily be shown with the help of the
inequalities given in Section 3.2 by letting the constant $B$ in those
inequalities approach 0.
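
The lemma can be checked numerically. The following is a rough simulation sketch and not a proof. It assumes two unit-variance normal hypotheses with means 0 and $\delta$, and it replaces "all values of $m$" by a finite horizon; the function name, horizon, number of paths, and seed are arbitrary choices.

```python
import math
import random


def fraction_below(A, delta=1.0, horizon=200, paths=2000, seed=1):
    """Fraction of simulated paths on which p_1m / p_0m < A for every m <= horizon.

    H0: x ~ N(0, 1), H1: x ~ N(delta, 1); the data are generated under H0.
    """
    rng = random.Random(seed)
    log_A = math.log(A)
    stayed = 0
    for _ in range(paths):
        log_ratio = 0.0
        ok = True
        for _ in range(horizon):
            x = rng.gauss(0.0, 1.0)
            # log of f(x; delta) / f(x; 0) for unit-variance normal densities
            log_ratio += delta * x - 0.5 * delta * delta
            if log_ratio >= log_A:
                ok = False
                break
        stayed += ok
    return stayed / paths


A = 5.0
estimate = fraction_below(A)
bound = 1.0 - 1.0 / A  # lower bound stated in the lemma
```

Under these assumptions the estimated fraction should not fall below the bound $1 - (1/A)$, apart from sampling error.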

With the help of this lemma we can construct a sequential sampling
plan satisfying the condition that the probability of making a wrong
decision does not exceed a prescribed value $r_0$ as follows. Let
$p_m(x_1, \cdots, x_m, \theta)$ be equal to $f(x_1, \theta) f(x_2, \theta) \cdots f(x_m, \theta)$ where $f(x, \theta)$ is
the probability distribution of $x$ when $\theta$ is true. For any parameter
point $\theta$ let $p_m^*(x_1, \cdots, x_m, \theta)$ be an arbitrary but given probability
distribution of the variates $x_1, x_2, \cdots, x_m$.† Then according to our
lemma the probability that

(10:7) $\frac{p_m^*(x_1, \cdots, x_m, \theta)}{p_m(x_1, \cdots, x_m, \theta)} < A$

will hold for all $m$ is greater than or equal to $1 - (1/A)$ when $\theta$ is true.
For any sample point $E_n = (x_1, \cdots, x_n)$, let $\omega_n(E_n)$ denote the totality
of all parameter points $\theta$ for which the inequality (10:7) is fulfilled for
all values $m \leq n$. Clearly, the probability that the true parameter
point $\theta$ will be included in all sets $\omega_n(E_n)$ $(n = 1, 2, \cdots$, ad inf.) is
greater than or equal to $1 - (1/A)$. The sequential sampling plan is
then defined as follows: We continue taking additional observations
as long as none of the weight functions $\bar{w}_1(\theta), \cdots, \bar{w}_k(\theta)$ is identically
zero in $\omega_n(E_n)$. At the first time when $\omega_n(E_n)$ is such that at least one
of the weight functions $\bar{w}_i(\theta)$ is identically 0 in $\omega_n(E_n)$, we
stop the process with the acceptance of the hypothesis corresponding
to the weight function which is identically zero in $\omega_n(E_n)$.‡ Obviously,
this sequential sampling plan will have the property that the probability
of making a wrong decision does not exceed $1/A$. If we let
$A$ equal $1/r_0$, then the probability of making a wrong decision will not
exceed $r_0$, as required.

* If the distribution of $x_1, x_2, \cdots$, etc. is discrete, $p_{im}(x_1, \cdots, x_m)$ denotes the
probability of obtaining a sample equal to the observed.

† It is understood that the distribution of $x_1, \cdots, x_n$ determined from the
distribution $p_{m'}^*(x_1, \cdots, x_{m'}, \theta)$ $(m' > n)$ is identical with $p_n^*(x_1, \cdots, x_n, \theta)$.

This method leads to a wide class $C$ of sequential sampling
plans with the required property, since the distribution function
$p_m^*(x_1, \cdots, x_m, \theta)$ in the numerator of (10:7) can be chosen entirely
arbitrarily. It is doubtful whether this class $C$ of sampling plans contains
an optimum plan in the sense of the definition given in 10.4.
If we are willing to restrict ourselves to sampling plans in class $C$, we
still have the problem of so choosing $p_m^*(x_1, \cdots, x_m, \theta)$ as to make the
expected number of observations required by the plan as small as possible.
This problem, too, has not yet been solved. There may be some
waste involved in letting $A = 1/r_0$, since this may result in a maximum
probability of making a wrong decision that is considerably less than
the tolerated value $r_0$. A further development of the theory may show
that $A$ can be put equal to some value smaller than $1/r_0$ which would
lead to a saving in the number of observations.

Although the present stage of the theory is very incomplete, sampling
plans based on the inequality (10:7) may still be used with good advantage
in some problems. Even if we cannot yet find the best distribution
$p_m^*(x_1, \cdots, x_m, \theta)$ to be used in the numerator of (10:7), we still
may be able to make a reasonably good choice of $p_m^*(x_1, \cdots, x_m, \theta)$
and thereby obtain a sequential plan which requires, on the average,
a substantially smaller number of observations than the best possible
non-sequential sampling plan based on a predetermined number of
observations.


Regarding possible choices of $p_m^*(x_1, \cdots, x_m, \theta)$ which may give
reasonably good results, the following remarks may be made. A good
result may be obtained in some problems by letting $p_m^*(x_1, \cdots, x_m, \theta)$
equal a properly chosen weighted average of $p_m(x_1, \cdots, x_m, \zeta)$ where
$\zeta$ is a variable parameter point. In other words, we let§

(10:8) $p_m^*(x_1, \cdots, x_m, \theta) = \int p_m(x_1, \cdots, x_m, \zeta) \, p_\theta(\zeta) \, d\zeta$

where the integration is taken over the whole parameter space and
$p_\theta(\zeta)$ is a non-negative function of $\zeta$ satisfying the condition

(10:9) $\int p_\theta(\zeta) \, d\zeta = 1$

The choice of the averaging function $p_\theta(\zeta)$ will depend on the weight
functions $\bar{w}_1(\theta), \cdots, \bar{w}_k(\theta)$. If, for example, $\bar{w}_j(\theta) = 0$ for the parameter
point $\theta$ under consideration, it seems reasonable to let $p_\theta(\zeta) = 0$
for all parameter points $\zeta$ for which $\bar{w}_j(\zeta) = 0$, since we are not interested
in discriminating between parameter points for which the same
decision is correct.

‡ If there are several weight functions which are identically 0 in $\omega_n(E_n)$, we may
choose arbitrarily one from among the hypotheses corresponding to these weight
functions.

§ The averaging function $p_\theta(\zeta)$ may also be discrete. Formulas valid for both
continuous and discrete averaging functions could be given by using Stieltjes
integrals.

The following is another possible choice of $p_m^*(x_1, \cdots, x_m, \theta)$ which
may lead to good results in some problems:

(10:10) $p_m^*(x_1, \cdots, x_m, \theta) = \varphi(x_1, \theta) f(x_2, \hat{\theta}_1) f(x_3, \hat{\theta}_2) \cdots f(x_m, \hat{\theta}_{m-1})$

where $\hat{\theta}_r$ is the maximum likelihood estimate of $\theta$ based on the first $r$
observations $x_1, \cdots, x_r$ and $\varphi(x_1, \theta)$ is some suitably chosen probability
distribution of $x_1$.

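To illustrate the sampling procedure based on (10:7), we shall consider
the following simple example. Let $x$ be normally distributed
with unknown mean $\theta$ and unit variance. Then

(10:11) $p_m(x_1, \cdots, x_m, \theta) = \frac{1}{(2\pi)^{m/2}} e^{-\frac{1}{2} \sum_{\alpha=1}^{m} (x_\alpha - \theta)^2}$

Let

(10:12) $p_m^*(x_1, \cdots, x_m, \theta) = \frac{1}{2} [p_m(x_1, \cdots, x_m, \theta + \delta) + p_m(x_1, \cdots, x_m, \theta - \delta)]$

where $\delta$ is a given positive quantity. Then

(10:13) $\frac{p_m^*(x_1, \cdots, x_m, \theta)}{p_m(x_1, \cdots, x_m, \theta)} = e^{-\frac{m\delta^2}{2}} \cosh [\delta \sum_{\alpha} (x_\alpha - \theta)]$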


The equation

(10:14) $\cosh u = v$ $(v > 1)$

has two roots in $u$ which are equal in absolute value. Let $\psi(v)$ be the
positive, and $-\psi(v)$ the negative root of (10:14). Then the roots of
the equation in $\theta$

(10:15) $e^{-\frac{m\delta^2}{2}} \cosh [\delta \sum_{\alpha} (x_\alpha - \theta)] = A$

are given by

(10:16) $\theta_1(E_m) = \bar{x}_m + \frac{\psi(e^{m\delta^2/2} A)}{m\delta}$ and $\theta_2(E_m) = \bar{x}_m - \frac{\psi(e^{m\delta^2/2} A)}{m\delta}$

where $\bar{x}_m$ is the arithmetic mean of the observations $x_1, \cdots, x_m$.
The set of all values of $\theta$ for which the inequality

$\frac{p_m^*(x_1, \cdots, x_m, \theta)}{p_m(x_1, \cdots, x_m, \theta)} < A$

is satisfied is the open interval $(\theta_2(E_m), \theta_1(E_m))$. The set $\omega_n(E_n)$ is
defined as the common part of the open intervals $(\theta_2(E_1), \theta_1(E_1)), \cdots,
(\theta_2(E_n), \theta_1(E_n))$. Hence $\omega_n(E_n)$ is equal to the open interval whose
lower endpoint is equal to the maximum of the values $\theta_2(E_1), \cdots,
\theta_2(E_n)$, and whose upper endpoint is equal to the minimum of the
values $\theta_1(E_1), \cdots, \theta_1(E_n)$.* Experimentation is terminated the first
time the open interval $\omega_n(E_n)$ is such that one of the weight functions
$\bar{w}_1(\theta), \cdots, \bar{w}_k(\theta)$ is identically zero in $\omega_n(E_n)$.
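
The computations of this example may be sketched as follows. The helper `psi` takes the logarithm of its argument so that large values of $m$ do not cause overflow. The function `decision` assumes the three-hypothesis weight functions of Section 10.4.4, which is an assumption made for this sketch, since the example above leaves the weight functions general. All function names are hypothetical.

```python
import math


def psi(log_v):
    """Positive root u of cosh(u) = v, given log(v) for v > 1."""
    # acosh(v) = log(v) + log(1 + sqrt(1 - v**-2)), evaluated without forming v
    return log_v + math.log1p(math.sqrt(1.0 - math.exp(-2.0 * log_v)))


def interval(xs, delta, A):
    """Open interval (theta_2(E_m), theta_1(E_m)) of (10:16) for the sample xs."""
    m = len(xs)
    mean = sum(xs) / m
    half = psi(0.5 * m * delta * delta + math.log(A)) / (m * delta)
    return mean - half, mean + half


def omega(xs, delta, A):
    """Common part of the intervals for m = 1, ..., n, or None if it is empty."""
    lo, hi, total = -math.inf, math.inf, 0.0
    for m, x in enumerate(xs, start=1):
        total += x
        half = psi(0.5 * m * delta * delta + math.log(A)) / (m * delta)
        lo = max(lo, total / m - half)
        hi = min(hi, total / m + half)
    return (lo, hi) if lo < hi else None


def decision(lo, hi, a, b, Delta):
    """Index of a hypothesis whose 0/1 weight vanishes on all of (lo, hi), else None."""
    if hi <= a + Delta:                      # w1 = 0 for theta < a + Delta
        return 1
    if a - Delta <= lo and hi <= b + Delta:  # w2 = 0 on (a - Delta, b + Delta)
        return 2
    if lo >= b - Delta:                      # w3 = 0 for theta >= b - Delta
        return 3
    return None                              # continue sampling
```

Since each interval has length greater than $\delta$, the rule in `decision` can only be expected to stop the process when $\delta$ is small relative to the width of the zones.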

As another illustration, consider again the example given in Section
10.1, and for simplicity assume that the standard deviation of $x$ is
equal to 1. Although the proper choice of $p_m^*(x_1, \cdots, x_m, \theta)$ for this
example has not been thoroughly investigated, the following choice of
$p_m^*(x_1, \cdots, x_m, \theta)$ is perhaps not unreasonable. A parameter point $\theta$
in the zone of preference for acceptance of $H_1$, i.e., a value $\theta \leq a - \Delta$,†
should be discriminated against all other parameter values $\zeta$ for which
acceptance of $H_1$ is a wrong decision. The smallest value $\zeta$ for which
acceptance of $H_1$ is a wrong decision, i.e., the smallest $\zeta$ for which
$\bar{w}_1(\zeta) = 1$, is $\zeta = a + \Delta$. Thus, we put

(10:17) $p_m^*(x_1, \cdots, x_m, \theta) = p_m(x_1, \cdots, x_m, a + \Delta)$ for all $\theta \leq a - \Delta$


If $\theta$ is in the zone of indifference between $H_1$ and $H_2$, i.e., if $a - \Delta <
\theta < a + \Delta$, we want to discriminate $\theta$ against values $\zeta$ for which acceptance
of $H_1$, as well as of $H_2$, is a wrong decision. The smallest
value of this kind is $\zeta = b + \Delta$. Thus, we let

(10:18) $p_m^*(x_1, \cdots, x_m, \theta) = p_m(x_1, \cdots, x_m, b + \Delta)$ if $a - \Delta < \theta < a + \Delta$

* If it happens that the upper endpoint determined in this way is less than the
lower endpoint, the set $\omega_n(E_n)$ is empty.

† For a definition of the various zones and weight functions $\bar{w}_1(\theta)$, $\bar{w}_2(\theta)$, and
$\bar{w}_3(\theta)$ for this example see Section 10.4.4.


If $\theta$ is in the zone of preference for acceptance of $H_2$, i.e., if $a + \Delta
\leq \theta < b - \Delta$, we want to discriminate it against values $\zeta$ for which
acceptance of $H_2$ is wrong. The greatest value $\zeta$ of this kind to the
left of $a + \Delta$ is $\zeta = a - \Delta$, and the smallest $\zeta$ of this kind to the right
of $b - \Delta$ is $\zeta = b + \Delta$. It seems, therefore, reasonable to let

(10:19) $p_m^*(x_1, \cdots, x_m, \theta) = \frac{1}{2} [p_m(x_1, \cdots, x_m, a - \Delta) + p_m(x_1, \cdots, x_m, b + \Delta)]$ if $a + \Delta \leq \theta < b - \Delta$

If $\theta$ is in the zone of indifference between $H_2$ and $H_3$, i.e., if
$b - \Delta \leq \theta < b + \Delta$, we want to discriminate $\theta$ against values $\zeta$ for
which the acceptance of $H_2$, as well as of $H_3$, is wrong. Thus, we let

(10:20) $p_m^*(x_1, \cdots, x_m, \theta) = p_m(x_1, \cdots, x_m, a - \Delta)$ if $b - \Delta \leq \theta < b + \Delta$

Finally, if $\theta$ is in the zone of preference for acceptance of $H_3$, i.e., if
$\theta \geq b + \Delta$, we want to discriminate $\theta$ against values $\zeta$ for which the
acceptance of $H_3$ is wrong. The least upper bound of values of $\zeta$ of
this kind is $\zeta = b - \Delta$. Thus, we shall let

(10:21) $p_m^*(x_1, \cdots, x_m, \theta) = p_m(x_1, \cdots, x_m, b - \Delta)$ for $\theta \geq b + \Delta$
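
The piecewise choice (10:17) to (10:21) may be sketched in code as follows, for unit-variance normal observations. The function `alternatives` returns the parameter points that are averaged with equal weights, and `log_ratio` returns the logarithm of the ratio appearing in (10:7). The function names and the numerical values are hypothetical.

```python
import math


def alternatives(theta, a, b, Delta):
    """Parameter points averaged (with equal weights) to form p*_m."""
    if theta <= a - Delta:
        return [a + Delta]             # (10:17)
    if theta < a + Delta:
        return [b + Delta]             # (10:18)
    if theta < b - Delta:
        return [a - Delta, b + Delta]  # (10:19)
    if theta < b + Delta:
        return [a - Delta]             # (10:20)
    return [b - Delta]                 # (10:21)


def log_ratio(xs, theta, a, b, Delta):
    """log of p*_m / p_m for unit-variance normal observations xs."""
    m, s = len(xs), sum(xs)

    def log_lr(zeta):
        # log p_m(xs, zeta) - log p_m(xs, theta)
        return (zeta - theta) * s - 0.5 * m * (zeta * zeta - theta * theta)

    terms = [log_lr(z) for z in alternatives(theta, a, b, Delta)]
    top = max(terms)
    # log of the equal-weight average, computed stably
    return top + math.log(sum(math.exp(t - top) for t in terms) / len(terms))


# Hypothetical values a = 0, b = 2, Delta = 0.5.
value = log_ratio([0.0, 0.0], -1.0, 0.0, 2.0, 0.5)
```

A parameter point $\theta$ belongs to $\omega_n(E_n)$ as long as this quantity has stayed below $\log A$ for every $m \leq n$.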


It should be remembered that there is no systematic theory yet
available for the proper choice of $p_m^*(x_1, \cdots, x_m, \theta)$. The choice of
$p_m^*(x_1, \cdots, x_m, \theta)$ in the above example has been made only on intuitive
grounds. It may well be that another choice of $p_m^*(x_1, \cdots, x_m, \theta)$
exists which leads to much better results. It should also be remarked
that it is doubtful whether an optimum sampling plan, as defined in the
preceding section, is a member of the class of sampling plans based on
the inequality (10:7). Further investigations are needed to clarify
these questions.



Chapter 11. THE PROBLEM OF SEQUENTIAL ESTIMATION 


11.1 Principles of the Current Theory of Estimation by Intervals or 
Sets 

In this section we shall give a brief outline of the basic ideas of
estimation by intervals or sets as developed by J. Neyman.¹ Consider
first the case in which the distribution of the random variable $x$ under
consideration is known except for the value of a single parameter $\theta$.
The problem treated in the current theory is that of estimating the
value of $\theta$ on the basis of a fixed number of observations, say $N$ observations
$x_1, \cdots, x_N$ on $x$.

Let $E$ denote the sample $(x_1, \cdots, x_N)$ and let $\bar{\theta}(E)$ and $\underline{\theta}(E)$ be two
single-valued functions of the sample $E$ such that

(11:1) $\underline{\theta}(E) \leq \bar{\theta}(E)$ for all possible samples $E$

Let $\delta(E)$ denote the interval extending from $\underline{\theta}(E)$ to $\bar{\theta}(E)$. We shall
refer to $\delta(E)$ also as an interval function, since it associates an interval
with each sample. Since the interval $\delta(E)$ is a function of the sample,
its location and length will, in general, be random variables and, therefore,
probability statements can be made as to whether $\delta(E)$ includes
the true parameter value $\theta$ or not. For any value $\theta$ we shall express
the relation that $\delta(E)$ contains $\theta$ by the symbol $\delta(E) \, C \, \theta$. For any relation
$R$, the symbol $P(R \mid \theta)$ will denote the probability that $R$ holds
when $\theta$ is the true parameter value.

According to Neyman, an interval function $\delta(E)$ is said to be a confidence
interval of $\theta$ if

(11:2) $P[\delta(E) \, C \, \theta \mid \theta] = \gamma$

identically in $\theta$ where $\gamma$ is a fixed value independent of $\theta$. The relation
(11:2) simply says this: The probability that $\delta(E)$ will include the true
parameter value is always equal to $\gamma$ no matter what the true value of
the parameter happens to be. The fixed value $\gamma$ is called the confidence
coefficient associated with the confidence interval $\delta(E)$.
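
The meaning of (11:2) can be illustrated by a small simulation. The sketch below assumes a normal population with known unit standard deviation and uses the familiar interval centered at the sample mean. This setting, the function name, and all numerical values are assumptions chosen for illustration and are not taken from the text.

```python
import random
from statistics import NormalDist


def coverage(theta, n_obs=10, gamma=0.95, reps=4000, seed=0):
    """Estimated probability that the interval mean +/- z / sqrt(N) contains theta."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + gamma / 2.0)  # two-sided normal quantile
    half = z / n_obs ** 0.5                      # half-length for unit variance
    hits = 0
    for _ in range(reps):
        mean = sum(rng.gauss(theta, 1.0) for _ in range(n_obs)) / n_obs
        hits += (mean - half <= theta <= mean + half)
    return hits / reps


# The estimated coverage should be close to gamma whatever the true theta is.
rates = [coverage(t) for t in (-3.0, 0.0, 10.0)]
```

The estimated coverage is about the same for each of the three parameter values, which is what "identically in $\theta$" requires.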


¹ J. Neyman, “Outline of a Theory of Statistical Estimation Based on the Classical
Theory of Probability,” Philosophical Transactions of the Royal Society of London,
Series A, Vol. 236 (1937), pp. 333-380.




Suppose, now, that the distribution of $x$ involves several unknown
parameters, say $\theta_1, \cdots, \theta_r$. Any set of possible values $\theta_1, \cdots, \theta_r$ can
be represented by a point $\theta$, called a parameter point, in the $r$-dimensional
Cartesian space (parameter space). If we want to estimate the
parameters $\theta_1, \cdots, \theta_r$ jointly, i.e., if we want to estimate the parameter
point $\theta$, the estimating set will be some subset of the $r$-dimensional
parameter space. Whereas in the case of a single unknown parameter,
estimating sets other than intervals have little practical value, this is
not so when several unknown parameters are to be estimated jointly.
Estimating sets other than intervals in the $r$-dimensional space, such
as the interior of a sphere, or ellipse, or more general regions, will
have to be considered. Thus, we shall have to consider a set function
$\omega(E)$ which associates with each sample point $E$ a certain subset $\omega(E)$
of the parameter space without making the restriction that $\omega(E)$ is an
$r$-dimensional interval.

A set function $\omega(E)$ is said to be a confidence region of the parameter
point $\theta = (\theta_1, \cdots, \theta_r)$ if

(11:3) $P[\omega(E) \, C \, \theta \mid \theta] = \gamma$

identically in $\theta$ where $\gamma$ is a fixed value independent of $\theta$. The value
$\gamma$ is called the confidence coefficient of the confidence region $\omega(E)$.

If only one of the parameters $\theta_1, \cdots, \theta_r$ is to be estimated, estimating
sets other than one-dimensional intervals will not be of much practical
interest, as in the case of a single unknown parameter. Suppose, for
example, that only $\theta_1$ is to be estimated. According to Neyman, an
interval function $\delta(E)$ is said to be a confidence interval of $\theta_1$ with
confidence coefficient $\gamma$ if

(11:4) $P[\delta(E) \, C \, \theta_1 \mid \theta_1, \theta_2, \cdots, \theta_r] = \gamma$

identically in $\theta_1, \theta_2, \cdots, \theta_r$.

Usually there will be infinitely many confidence intervals $\delta(E)$ or
confidence regions $\omega(E)$ with a given confidence coefficient $\gamma$ and a
fundamental problem is to find a proper confidence interval or confidence
region which has some optimum properties. It is clear that a
confidence interval or confidence region with a given confidence coefficient
$\gamma$ will be regarded the better the shorter the interval or the
smaller the region. The notion “short” or “small” is to be made precise
since the length of a confidence interval and the size of a confidence
region are random variables depending on the outcome of the
sample. This has been done in the theory developed by Neyman who
introduced various notions of optimum confidence intervals and confidence
regions. The mathematical consequences of these definitions
have been investigated and optimum confidence intervals and regions
have been derived in many important cases. It is not intended to go
into further details here and the reader is referred to the original publications
of Neyman on this subject.

11.2 Formulation of the Problem of Sequential Estimation by Intervals or
Sets

In estimation procedures based on a fixed number of observations, 
we cannot control, in general, the length of the confidence interval 
obtained, since this depends on the outcome of the sample. It may, 
therefore, sometimes happen that the confidence interval obtained is 
so long that it has little or no practical value. The possibility of such 
an event is a drawback, inherent in estimation procedures based on a 
predetermined number of observations. 

For example, the length of the best confidence interval, based on a
fixed number of observations, for the mean of a normal population
with unknown standard deviation is proportional to the sample estimate
$s$ of the population standard deviation $\sigma$. The sample standard
deviation $s$ may take any value and is likely to be large if $\sigma$ is large.

To devise estimation procedures which lead to confidence intervals 
not only with a prescribed confidence coefficient but also with a pre- 
scribed length, or with a length not exceeding a prescribed value, or 
which satisfies some other similar condition, it is, in general, necessary 
to abandon the approach based on a fixed number of observations, and 
estimation procedures of sequential nature have to be constructed.²

The general nature of a sequential procedure of estimation by sets
may be described as follows. For any positive integer $m$ we consider
a set $S_m$ of samples of size $m$. These sets must satisfy the following
condition. If the sample $E_m$ is an element of $S_m$ and if $E_{m'}$ $(m' > m)$
is an element of $S_{m'}$, then $E_m$ must not be equal to the sample consisting
of the first $m$ observations in $E_{m'}$. With any element $E_m$ of $S_m$
$(m = 1, 2, \cdots$, ad inf.), we associate a subset $\omega(E_m)$ of the parameter
space.³ The sequential process of estimation is then carried out as
follows. We continue to make observations on $x$ until we reach a value
$n$ such that $E_n$ is an element of $S_n$. At this stage, we stop the process
and state that $\omega(E_n)$ contains the true parameter point, i.e., $\omega(E_n)$ is
the confidence set resulting from the sequential estimation procedure.

² A very interesting sequential procedure has been devised by C. Stein, “A Two
Sample Test for a Linear Hypothesis whose Power Is Independent of the Variance,”
The Annals of Mathematical Statistics, Vol. XVI, Sept., 1945, which leads
to confidence intervals of fixed length in an important class of problems, including
the example mentioned before.

³ If we are concerned with interval estimation, $\omega(E_m)$ will always be an interval.

Thus, a sequential estimation procedure is determined by the sample
sets $S_1, S_2, \cdots$, etc., and the set function $\omega(E)$ defined for all samples
$E$ in $S_1, S_2, \cdots$, etc. The fundamental problem in sequential estimation
is that of a proper choice of $S_1, S_2, \cdots$, etc., and of $\omega(E)$. First
we impose the following two conditions:

Condition I. The confidence set $\omega(E_n)$ resulting from the sequential
estimation procedure should satisfy certain stated requirements regarding
its geometric shape.

Condition II. The confidence set $\omega(E_n)$ resulting from the sequential
estimation procedure should satisfy the inequality⁴

$P[\omega(E_n) \, C \, \theta \mid \theta] \geq \gamma$

for all parameter points $\theta$. (The quantity $\gamma$ is a fixed value which is
frequently chosen as high as .95, or more.)

The requirements to be imposed on the geometric shape of the confidence
set $\omega(E_n)$ do not constitute a statistical problem, and they will
be decided on the basis of practical considerations in each particular
problem. For example, if there is only one unknown parameter $\theta$ (the
parameter space is one-dimensional), we may want to require that
$\omega(E)$ be an interval whose length should not exceed some fixed prescribed
value $d$, or some given function of the midpoint of the interval.
The latter case may be of interest, for example, in estimating the mean
of a binomial distribution. If there are several unknown parameters,
say $\theta_1, \cdots, \theta_r$, and we want to estimate them jointly, we may require
that the Euclidean volume, or the diameter,⁵ of the confidence set
$\omega(E_n)$ does not exceed some fixed prescribed value. If we merely want
to estimate one of the unknown parameters, say $\theta_1$, we may impose
the requirement that $\omega(E_n)$ be an interval with length not exceeding
some prescribed fixed value, or the weaker requirement that $\omega(E_n)$ be
a subset of the $r$-dimensional parameter space whose projection on the
$\theta_1$-axis has a diameter not exceeding some preassigned value.

Usually there will exist infinitely many sequential estimation procedures
which satisfy Conditions I and II. The criterion for selecting
one from among them will be based on the expected number of observations
required by the estimation procedure. The sequential estimation
procedure may be regarded the better the smaller the expected
number of observations required by the procedure. Thus, we shall try
to select a sequential estimation procedure from the class of procedures
satisfying Conditions I and II for which the expected number of observations
to be made is as small as possible.

⁴ This is weaker than the requirement by Neyman that the equality sign should
hold.

⁵ The diameter of a set is the largest possible distance between two points of
the set.

The problem of finding an optimum estimation procedure is un- 
solved. However, a special class of estimation procedures satisfying 
Conditions I and II will be discussed briefly in the next section. It is
doubtful whether this class of procedures contains an optimum solu- 
tion in the sense defined before. 

11.3 A Special Class of Sequential Estimation Procedures 

The special class of sampling plans based on the inequality (10:7),
and discussed in Section 10.5, can be used to obtain estimation procedures
satisfying Conditions I and II. With each sample point $E_n =
(x_1, \cdots, x_n)$ $(n = 1, 2, \cdots$, ad inf.) we associate the set $\omega(E_n)$ consisting
of all parameter points $\theta$ for which (10:7) is fulfilled for all
values $m \leq n$. If we put $A = 1/(1 - \gamma)$, then $\omega(E_n)$ will satisfy Condition
II for each $n$. The estimation procedure is carried out as follows.
We continue taking observations as long as $\omega(E_n)$ does not
satisfy the requirements in Condition I. We stop the process at the
smallest $n$ for which $\omega(E_n)$ satisfies Condition I and then state that
the true parameter point $\theta$ is included in $\omega(E_n)$. This rule of stopping
insures automatically the fulfillment of Condition I.

If $p_m^*(x_1, \ldots, x_m, \theta)$ is chosen so that the probability is 1
that the diameter of $\omega(E_m)$ will converge to 0 as $m \to \infty$,
and if Condition I is such that any set of sufficiently small diameter
satisfies it, the probability is 1 that the estimation process will be
terminated at a finite stage.

It is doubtful whether the special class of procedures considered here
contains an optimum procedure in the sense of the preceding section.
Even if we are willing to restrict ourselves to procedures based on
(10:7), there is no theory yet developed for the proper choice of
$p_m^*(x_1, \ldots, x_m, \theta)$. Our aim is, of course, to choose
$p_m^*(x_1, \ldots, x_m, \theta)$ so that the expected number of
observations required by the procedure should be as small as possible.
An optimum choice of $p_m^*(x_1, \ldots, x_m, \theta)$ will depend also on
the nature of Condition I. For example, if a certain choice of
$p_m^*(x_1, \ldots, x_m, \theta)$ is optimal when Condition I requires that
the diameter of $\omega(E_n)$ does not exceed a preassigned value, this
choice will probably not be optimal when Condition I requires that the
diameter of the projection of $\omega(E_n)$ on one of the parameter axes
does not exceed a preassigned value, and vice versa.

There may be some waste involved in putting $A = 1/(1 - \gamma)$, since
this may imply the validity of Condition II for a value $\gamma'$
substantially larger than the intended $\gamma$. A further development of
the theory may show that A can be put equal to some value smaller than
$1/(1 - \gamma)$ which would lead to a saving in the number of
observations.



APPENDIX 


A.1 PROOF THAT THE PROBABILITY IS 1 THAT THE SEQUENTIAL
PROBABILITY RATIO TEST WILL EVENTUALLY TERMINATE


The sequential probability ratio test terminates at the nth trial,
where n is the smallest integer for which either

$z_1 + \cdots + z_n \ge \log A$

or

$z_1 + \cdots + z_n \le \log B$

where

$z_i = \log \dfrac{f(x_i, \theta_1)}{f(x_i, \theta_0)}$

Let $c = |\log B| + |\log A|$. We shall subdivide the infinite sequence
$z_1, z_2, \ldots$, ad inf., into segments of length r where r is some
positive integer. Thus, the first segment $S_1$ will consist of the
elements $z_1, \ldots, z_r$, the second segment $S_2$ will contain the
elements $z_{r+1}, \ldots, z_{2r}$, etc. In general, the kth segment $S_k$
will consist of the elements $z_{(k-1)r+1}, \ldots, z_{kr}$. Let $\zeta_k$
denote the sum of the elements in the kth segment. It can be seen that if
the infinite sequence $z_1, z_2, \ldots$, ad inf., is such that the
sequential process never terminates, then we must have

(A:1)    $|\zeta_k| < c$  for $k = 1, 2, \ldots$, ad inf.

Inequality (A:1) can also be written

(A:2)    $\zeta_k^2 < c^2$  for $k = 1, 2, \ldots$, ad inf.

Thus, in order to show that the probability is 1 that the sequential
process will eventually terminate, it is sufficient to prove that the
probability is 0 that (A:2) holds for all integral values k. For any
given positive integer i denote by $P_i$ the probability that
$\zeta_i^2 < c^2$. Since $z_1, z_2, \ldots$, are independently
distributed, each having the same distribution, the distribution of
$\zeta_i^2$ must be the same for all values i. Hence, also $P_i$ is
independent of i and we shall denote it by P. Since
$\zeta_1, \zeta_2, \ldots$, etc., are independently distributed, the
probability of the joint event that (A:2) holds for $k = 1, 2, \ldots, j$
is equal to $P^j$. Hence, in order to show that the probability is 0 that
(A:2) holds for all values k, it is sufficient to show that $P < 1$.
Clearly, if the expected value of $\zeta_i^2$ is $> c^2$, then P must be
$< 1$. Since the variance of $z_i$ is assumed to be positive, the expected
value of $\zeta_i^2$, which is at least r times the variance of $z_i$, can
be made arbitrarily large by choosing r, i.e., the number of elements in a
segment, sufficiently large. Thus, $P < 1$, and we have proved the
proposition: The probability is 1 that the sequential probability ratio
test procedure will eventually terminate.
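
The proposition can be illustrated numerically. The following is a minimal sketch, not part of the original text: it simulates the test for a 0-1 variable, with boundaries chosen as A = (1 - beta)/alpha and B = beta/(1 - alpha). The function name `sprt_stopping_time`, the cap `max_n`, and all numerical values are assumptions made for illustration.

```python
import math
import random

def sprt_stopping_time(p_true, p0, p1, alpha, beta, rng, max_n=100_000):
    """Run one sequential probability ratio test on simulated 0-1 data.

    Returns (n, decision), or (None, None) if the cap max_n is reached.
    """
    log_a = math.log((1 - beta) / alpha)      # upper boundary log A
    log_b = math.log(beta / (1 - alpha))      # lower boundary log B
    z_one = math.log(p1 / p0)                 # z_i when x_i = 1
    z_zero = math.log((1 - p1) / (1 - p0))    # z_i when x_i = 0
    total = 0.0                               # running sum z_1 + ... + z_n
    for n in range(1, max_n + 1):
        x = 1 if rng.random() < p_true else 0
        total += z_one if x else z_zero
        if total >= log_a:
            return n, "reject H0"
        if total <= log_b:
            return n, "accept H0"
    return None, None

rng = random.Random(0)
# p_true = 0.5 gives E(z) = 0, the slowest case for these hypotheses.
stops = [sprt_stopping_time(0.5, 0.4, 0.6, 0.05, 0.05, rng)[0]
         for _ in range(2000)]
print("largest n observed:", max(stops))
```

Every simulated run stops long before the cap is reached, in agreement with the proposition, although a simulation of course proves nothing by itself.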


A.2 UPPER AND LOWER LIMITS FOR THE OC FUNCTION OF A SEQUEN- 
TIAL TEST 


A.2.1 A Lemma 


In what follows we shall denote the expected value of any random
variable z by E(z). For any relation R we shall use the symbol P(R)
to denote the probability that R holds. If the expected value E(z)
or the probability P(R) has been determined under the assumption
that $\theta$ is the true value of the parameter involved in the
distribution of the random variable under consideration, we shall
occasionally put this in evidence by using the symbols $E_\theta(z)$ and
$P_\theta(R)$, respectively.*
In deriving lower and upper limits for the OC function of a sequential
test, we shall make use of the following lemma.


Lemma A.1. Let z be a random variable such that the following three
conditions are fulfilled:

Condition I. The expected value E(z) exists and is not equal to 0.

Condition II. There exists a positive $\delta$ such that
$P(e^z < 1 - \delta) > 0$ and $P(e^z > 1 + \delta) > 0$.

Condition III. For any real value h the expected value
$E(e^{hz}) = g(h)$ exists.

Then there exists one and only one real value $h_0 \ne 0$ such that

$E(e^{h_0 z}) = 1$


Proof: For any positive h we have

(A:3)    $g(h) > P(e^z > 1 + \delta)\,(1 + \delta)^h$

Hence, since $P(e^z > 1 + \delta) > 0$,

(A:4)    $\lim_{h \to \infty} g(h) = +\infty$

Similarly, we see that for any negative h

$g(h) > P(e^z < 1 - \delta)\,(1 - \delta)^h$

Hence, since $P(e^z < 1 - \delta) > 0$, we have

(A:5)    $\lim_{h \to -\infty} g(h) = +\infty$


* If there are several unknown parameters, say
$\theta_1, \ldots, \theta_k$, then $\theta$ denotes the set
$(\theta_1, \ldots, \theta_k)$.



Since $g''(h) = E(z^2 e^{hz})$,* it follows from Condition II that

(A:6)    $g''(h) > 0$

for all real values of h.

The relations (A:4), (A:5), (A:6) imply that there exists exactly one
real value $h^*$ such that g(h) takes its minimum value for $h = h^*$.
Since $g'(0) = E(z)$ is unequal to 0 by Condition I, we see that
$h^* \ne 0$ and $g(h^*) < g(0) = 1$. It is clear that the function g(h) is
monotonically decreasing in the strict sense over the interval
$(-\infty, h^*)$ and is monotonically increasing in the strict sense over
the interval $(h^*, +\infty)$. Since $g(0) = 1$ and $g(h^*) < 1$, there
exists exactly one real value $h_0 \ne 0$ such that $g(h_0) = 1$. Hence
Lemma A.1 is proved.

From the above considerations it follows that if $h^* > 0$ then also
$h_0 > 0$, and if $h^* < 0$ then also $h_0 < 0$. Furthermore, if $h^* > 0$
then $E(z) = g'(0) < 0$, and if $h^* < 0$ then $E(z) = g'(0) > 0$. Hence,
$h_0$ and E(z) are of opposite sign.
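
The root described by the lemma is easy to locate numerically when z takes finitely many values. The sketch below is an illustration only, not taken from the original: it uses the convexity of g(h) and the sign rule just derived to bracket the root, then bisects. The names `mgf` and `solve_h0` and the sample distributions are hypothetical.

```python
import math

def mgf(values, probs, h):
    """g(h) = E(e^{hz}) for a random variable z with finitely many values."""
    return sum(p * math.exp(h * z) for z, p in zip(values, probs))

def solve_h0(values, probs, iters=200):
    """Nonzero root h0 of g(h) = 1, found by bracketing and bisection."""
    mean = sum(p * z for z, p in zip(values, probs))
    if mean == 0:
        raise ValueError("Condition I requires E(z) != 0")
    # h0 and E(z) are of opposite sign, so search on the opposite side of 0.
    hi = -1.0 if mean > 0 else 1.0
    while mgf(values, probs, hi) < 1.0:      # move outward until g >= 1
        hi *= 2.0
    lo = hi / 2.0
    while mgf(values, probs, lo) >= 1.0 and abs(lo) > 1e-12:
        hi, lo = lo, lo / 2.0                # move inward until g < 1
    for _ in range(iters):                   # now g(lo) < 1 <= g(hi)
        mid = 0.5 * (lo + hi)
        if mgf(values, probs, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# z = log[f(x, p1) / f(x, p0)] for a 0-1 variable with p0 = 0.4, p1 = 0.6.
z_values = [math.log(0.6 / 0.4), math.log(0.4 / 0.6)]
h_under_h0 = solve_h0(z_values, [0.4, 0.6])      # true p = p0
h_under_h1 = solve_h0(z_values, [0.6, 0.4])      # true p = p1
h_between = solve_h0(z_values, [0.45, 0.55])     # true p = 0.45
print(h_under_h0, h_under_h1, h_between)
```

In this example the root is 1 when the true probability is p0 and -1 when it is p1, which agrees with the remark made in Section A.2.3.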


A.2.2 A Fundamental Identity


In this section we shall derive an identity which will play a
fundamental role. Consider the sequential probability ratio test for
testing the hypothesis $H_0$ that the probability distribution of x is
given by $f(x, \theta_0)$ against the alternative hypothesis $H_1$ that the
probability distribution in question is given by $f(x, \theta_1)$. Let

$z = \log \dfrac{f(x, \theta_1)}{f(x, \theta_0)}$  and
$z_i = \log \dfrac{f(x_i, \theta_1)}{f(x_i, \theta_0)}$

where $x_i$ denotes the ith observation on x. As defined in Section 3.1,
the test procedure is given as follows. Continue taking observations as
long as

(A:7)    $\log B < z_1 + \cdots + z_m < \log A$

where A and B (B < A) are constants determined before the
experimentation starts. Accept $H_0$ when

(A:8)    $z_1 + \cdots + z_m \le \log B$

and reject $H_0$ (accept $H_1$) when

(A:9)    $z_1 + \cdots + z_m \ge \log A$

* From Condition III it follows that all derivatives of g(h) exist, and
they may be obtained by differentiation under the integral sign, i.e.,

$\dfrac{d^r g(h)}{dh^r} = E(z^r e^{hz})$    ($r = 1, 2, \ldots$, ad inf.)


In what follows we shall denote by n the number of observations
required by the test. Clearly, n is a random variable. Let D' be the
subset of the complex plane such that $\varphi(t) = E(e^{zt})$ exists and
is finite for any point t in D'. Consider the following identity:

(A:10)    $E\left[e^{Z_n t + (Z_N - Z_n)t}\right] = [\varphi(t)]^N$

where N denotes a positive integer and $Z_i = z_1 + \cdots + z_i$. Let
$P_N$ be the probability that $n \le N$. For any random variable u, let
$E_N(u)$ denote the conditional expected value of u under the restriction
that $n \le N$, and let $E_N^*(u)$ denote the conditional expected value of
u under the restriction that $n > N$. Then identity (A:10) can be written
as

(A:11)    $P_N E_N\left[e^{Z_n t + (Z_N - Z_n)t}\right] + (1 - P_N) E_N^*(e^{Z_N t}) = [\varphi(t)]^N$

Since in the subpopulation defined by any fixed $n \le N$ the expression
$Z_N - Z_n$ is independent of $Z_n$, we have

(A:12)    $E_N\left[e^{Z_n t + (Z_N - Z_n)t}\right] = E_N\left\{e^{Z_n t}[\varphi(t)]^{N-n}\right\}$

From (A:11) and (A:12) we obtain the identity

(A:13)    $P_N E_N\left\{e^{Z_n t}[\varphi(t)]^{N-n}\right\} + (1 - P_N) E_N^*(e^{Z_N t}) = [\varphi(t)]^N$

Dividing both sides by $[\varphi(t)]^N$ we obtain

(A:14)    $P_N E_N\left\{e^{Z_n t}[\varphi(t)]^{-n}\right\} + (1 - P_N) \dfrac{E_N^*(e^{Z_N t})}{[\varphi(t)]^N} = 1$

Let D'' be the subset of the complex plane in which
$|\varphi(t)| \ge 1$ and let D denote the common part of the subsets D'
and D''. Since $\lim_{N \to \infty} (1 - P_N) = 0$, and since
$|E_N^*(e^{Z_N t})|$ is a bounded function of N, we have in D

(A:15)    $\lim_{N \to \infty} (1 - P_N) \dfrac{E_N^*(e^{Z_N t})}{[\varphi(t)]^N} = 0$

Since

$\lim_{N \to \infty} P_N E_N\left\{e^{Z_n t}[\varphi(t)]^{-n}\right\} = E\left\{e^{Z_n t}[\varphi(t)]^{-n}\right\}$

we obtain from (A:14) and (A:15) the fundamental identity

(A:16)    $E\left\{e^{Z_n t}[\varphi(t)]^{-n}\right\} = 1$

for any point t in the set D.
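
The identity (A:16) can be checked exactly in a simple special case. The sketch below is only an illustration under stated assumptions: z is taken to be +1 with probability p and -1 with probability 1 - p, the boundaries log A and log B are replaced by the integers `upper` and `lower`, and the distribution of (n, Z_n) is propagated step by step. The function name `identity_lhs` and the chosen numbers are hypothetical.

```python
import math

def identity_lhs(p, t, upper, lower, tol=1e-15, max_steps=100_000):
    """E[e^{t Z_n} phi(t)^{-n}] for a +1/-1 walk stopped at upper or lower."""
    q = 1.0 - p
    phi = p * math.exp(t) + q * math.exp(-t)   # phi(t) = E(e^{zt})
    alive = {0: 1.0}          # probability mass of paths not yet stopped
    total = 0.0
    n = 0
    while sum(alive.values()) > tol and n < max_steps:
        n += 1
        nxt = {}
        for s, mass in alive.items():
            for step, w in ((1, p), (-1, q)):
                s2 = s + step
                if s2 >= upper or s2 <= lower:
                    # the test stops at trial n with Z_n = s2
                    total += mass * w * math.exp(t * s2) * phi ** (-n)
                else:
                    nxt[s2] = nxt.get(s2, 0.0) + mass * w
        alive = nxt
    return total

# Real points t with phi(t) >= 1, so that t lies in the set D.
check_a = identity_lhs(0.4, 1.0, 3, -3)            # phi(t) > 1
check_b = identity_lhs(0.4, math.log(1.5), 3, -3)  # phi(t) = 1
check_c = identity_lhs(0.4, -2.0, 4, -2)           # phi(t) > 1
print(check_a, check_b, check_c)
```

Each value agrees with 1 up to rounding error. The second case, in which phi(t) = 1, is the substitution that leads to (A:17) in the next section.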



A.2.3 Derivation of Upper and Lower Limits for the OC Function

The OC function of the sequential test is defined by the function
$L(\theta)$, where $L(\theta)$ denotes the probability that the sequential
process leads to the acceptance of $H_0$ when $\theta$ is the true value of
the parameter.* It has been shown in Section A.1 that the probability is
0 that the sequential process will never terminate, i.e., the relation
$P(n = \infty) = 0$ has been proved. Thus, the probability that the
process will terminate with the rejection of $H_0$ (acceptance of $H_1$)
is given by $1 - L(\theta)$. Using the fundamental identity derived in the
preceding section we shall obtain upper and lower limits for $L(\theta)$.

It will be assumed that the distribution of
$z = \log \dfrac{f(x, \theta_1)}{f(x, \theta_0)}$ satisfies the three
conditions of Lemma A.1 for any value $\theta$. Then for any given
$\theta$ there exists exactly one real value $h(\theta) \ne 0$ such that
$E_\theta(e^{h(\theta) z}) = 1$. Substituting $h(\theta)$ for t in the
fundamental identity (A:16), we obtain

(A:17)    $E_\theta(e^{h(\theta) Z_n}) = 1$

since $\varphi[h(\theta)] = 1$.

Let $E_\theta^*$ be the conditional expected value of
$e^{h(\theta) Z_n}$ under the restriction that $H_0$ is accepted, i.e.,
that $Z_n \le \log B$, and let $E_\theta^{**}$ be the conditional expected
value of $e^{h(\theta) Z_n}$ under the restriction that $H_1$ is accepted,
i.e., that $Z_n \ge \log A$. Then we obtain, from (A:17),

(A:18)    $L(\theta) E_\theta^* + [1 - L(\theta)] E_\theta^{**} = 1$

Solving for $L(\theta)$ we obtain

(A:19)    $L(\theta) = \dfrac{E_\theta^{**} - 1}{E_\theta^{**} - E_\theta^*}$

If both the absolute value of $E_\theta(z)$ and the variance of z are
small, as they will be when $f(x, \theta_1)$ is near $f(x, \theta_0)$, then
$E_\theta^*$ and $E_\theta^{**}$ will be nearly equal to $B^{h(\theta)}$
and $A^{h(\theta)}$, respectively. Hence, in this case a good
approximation to $L(\theta)$ is given by the expression

(A:20)    $\bar{L}(\theta) = \dfrac{A^{h(\theta)} - 1}{A^{h(\theta)} - B^{h(\theta)}}$

This is the approximation formula (3:43) given in Section 3.4. It is
easy to verify that $h(\theta) = 1$ if $\theta = \theta_0$, and
$h(\theta) = -1$ if $\theta = \theta_1$. The difference
$\bar{L}(\theta) - L(\theta)$ approaches 0 if both the mean and the
variance of z converge to 0.

* For simplicity the case of a single unknown parameter $\theta$ is
discussed, but the results can obviously be extended to any number of
parameters.
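
A short numerical sketch of the approximation (A:20) follows. It is illustrative only: the function name `oc_approx` and the values of alpha and beta are assumptions, and the boundaries are set to A = (1 - beta)/alpha and B = beta/(1 - alpha).

```python
def oc_approx(h, A, B):
    """Approximation (A:20): (A^h - 1) / (A^h - B^h), for h != 0."""
    return (A ** h - 1.0) / (A ** h - B ** h)

alpha, beta = 0.05, 0.10
A = (1 - beta) / alpha
B = beta / (1 - alpha)

at_theta0 = oc_approx(1.0, A, B)    # h = 1 when theta = theta_0
at_theta1 = oc_approx(-1.0, A, B)   # h = -1 when theta = theta_1
print(at_theta0, at_theta1)
```

With this choice of A and B the approximation equals 1 - alpha at theta_0 and beta at theta_1, which can be confirmed algebraically from (A:20).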


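To judge the goodness of the approximation given by $\bar{L}(\theta)$ it
is desirable to derive lower and upper limits for $L(\theta)$. Such limits
can be obtained by deriving lower and upper limits for $E_\theta^*$ and
$E_\theta^{**}$. First we consider the case when $h(\theta) > 0$. To obtain
a lower limit for $E_\theta^*$ consider a real variable $\zeta$ which is
restricted to values > 1. For any random variable u and any relation R we
shall denote by $E(u \mid R)$ the conditional expected value of u under
the restriction that R holds. Let $P_\theta(\zeta)$ denote the probability,
under the restriction that $H_0$ is accepted, that
$e^{h(\theta) Z_{n-1}} < \zeta B^{h(\theta)}$. Then we have

(A:21)    $E_\theta^* = B^{h(\theta)} \int_1^\infty \zeta\, E_\theta\left(e^{h(\theta) z} \mid e^{h(\theta) z} \le \tfrac{1}{\zeta}\right) dP_\theta(\zeta)$

Hence, a lower bound of $E_\theta^*$ is given by

(A:22)    $B^{h(\theta)} \left[\text{g.l.b.}_\zeta\ \zeta E_\theta\left(e^{h(\theta) z} \mid e^{h(\theta) z} \le \tfrac{1}{\zeta}\right)\right]$

where the symbol g.l.b. stands for greatest lower bound with respect to
$\zeta$. Since $B^{h(\theta)}$ is an upper bound of $E_\theta^*$, we obtain
the limits

(A:23)    $B^{h(\theta)} \left[\text{g.l.b.}_\zeta\ \zeta E_\theta\left(e^{h(\theta) z} \mid e^{h(\theta) z} \le \tfrac{1}{\zeta}\right)\right] \le E_\theta^* \le B^{h(\theta)}$    [$h(\theta) > 0$]

To derive limits for $E_\theta^{**}$ consider a real variable $\rho$
which is restricted to values > 0 and < 1. Let $Q_\theta(\rho)$ denote the
probability, under the restriction that $H_1$ is accepted, that
$e^{h(\theta) Z_{n-1}} < \rho A^{h(\theta)}$. Then we obtain

(A:24)    $E_\theta^{**} = A^{h(\theta)} \int_0^1 \rho\, E_\theta\left(e^{h(\theta) z} \mid e^{h(\theta) z} \ge \tfrac{1}{\rho}\right) dQ_\theta(\rho)$

Hence an upper bound of $E_\theta^{**}$ is given by

(A:25)    $A^{h(\theta)} \left[\text{l.u.b.}_\rho\ \rho E_\theta\left(e^{h(\theta) z} \mid e^{h(\theta) z} \ge \tfrac{1}{\rho}\right)\right]$

Since $A^{h(\theta)}$ is a lower bound of $E_\theta^{**}$, we obtain the
following limits for $E_\theta^{**}$:

(A:26)    $A^{h(\theta)} \le E_\theta^{**} \le A^{h(\theta)} \left[\text{l.u.b.}_\rho\ \rho E_\theta\left(e^{h(\theta) z} \mid e^{h(\theta) z} \ge \tfrac{1}{\rho}\right)\right]$    [$h(\theta) > 0$]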


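Putting

(A:27)    $\eta_\theta = \text{g.l.b.}_\zeta\ \zeta E_\theta\left(e^{h(\theta) z} \mid e^{h(\theta) z} \le \tfrac{1}{\zeta}\right)$

and

(A:28)    $\delta_\theta = \text{l.u.b.}_\rho\ \rho E_\theta\left(e^{h(\theta) z} \mid e^{h(\theta) z} \ge \tfrac{1}{\rho}\right)$

inequalities (A:23) and (A:26) can be written as

(A:29)    $B^{h(\theta)} \eta_\theta \le E_\theta^* \le B^{h(\theta)}$

and

(A:30)    $A^{h(\theta)} \le E_\theta^{**} \le \delta_\theta A^{h(\theta)}$

Since $B < 1$ and $A > 1$,* we see that $E_\theta^* < 1$ and
$E_\theta^{**} > 1$ if $h(\theta) > 0$. From this and relations (A:19),
(A:29), and (A:30), it follows that

(A:31)    $\dfrac{A^{h(\theta)} - 1}{A^{h(\theta)} - \eta_\theta B^{h(\theta)}} \le L(\theta) \le \dfrac{\delta_\theta A^{h(\theta)} - 1}{\delta_\theta A^{h(\theta)} - B^{h(\theta)}}$

where $h(\theta) > 0$.

If $h(\theta) < 0$, limits for $L(\theta)$ can be obtained as follows.
Let $z' = -z$, $A' = 1/B$ and $B' = 1/A$. Consider the sequential test S'
defined as follows. Continue taking observations as long as
$\log B' < z'_1 + \cdots + z'_m < \log A'$. Terminate the process with one
or the other decision, depending on whether
$z'_1 + \cdots + z'_m \le \log B'$ or $\ge \log A'$. We shall let
$L'(\theta)$ be the probability that at the termination of the process the
cumulative sum $z'_1 + \cdots + z'_m$ is less than or equal to $\log B'$.
Then $L'(\theta) = 1 - L(\theta)$. Furthermore, we shall denote the
quantities $h(\theta)$, $\eta_\theta$, $\delta_\theta$ corresponding to the
test S' by $h'(\theta)$, $\eta'_\theta$, and $\delta'_\theta$,
respectively.

We can apply (A:31) to the test S', since
$h'(\theta) = -h(\theta) > 0$. Thus, we obtain

(A:32)    $\dfrac{A'^{h'(\theta)} - 1}{A'^{h'(\theta)} - \eta'_\theta B'^{h'(\theta)}} \le L'(\theta) \le \dfrac{\delta'_\theta A'^{h'(\theta)} - 1}{\delta'_\theta A'^{h'(\theta)} - B'^{h'(\theta)}}$

where $h'(\theta) > 0$. Since $\eta_\theta$ and $\delta_\theta$ depend only
on the distribution of $h(\theta) z$, and since
$h'(\theta) z' = h(\theta) z$, we have $\eta'_\theta = \eta_\theta$ and
$\delta'_\theta = \delta_\theta$. Substituting, in (A:32), $\delta_\theta$
for $\delta'_\theta$, $\eta_\theta$ for $\eta'_\theta$, 1/B for A', 1/A for
B', $-h(\theta)$ for $h'(\theta)$, and $1 - L(\theta)$ for $L'(\theta)$, we
obtain

* We have assumed that B < A. Since we let $B = \beta/(1 - \alpha)$ and
$A = (1 - \beta)/\alpha$, we must have
$\beta/(1 - \alpha) < (1 - \beta)/\alpha$. Multiplying this inequality by
$\alpha(1 - \alpha)$, we obtain
$\alpha\beta < 1 - \alpha - \beta + \alpha\beta$, i.e.,
$0 < 1 - \alpha - \beta$. Hence $\beta < 1 - \alpha$ and
$1 - \beta > \alpha$, and therefore B < 1 and A > 1.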

(A:33)    $\dfrac{B^{h(\theta)} - 1}{B^{h(\theta)} - \eta_\theta A^{h(\theta)}} \le 1 - L(\theta) \le \dfrac{\delta_\theta B^{h(\theta)} - 1}{\delta_\theta B^{h(\theta)} - A^{h(\theta)}}$

where $h(\theta) < 0$. Hence

(A:34)    $\dfrac{1 - A^{h(\theta)}}{\delta_\theta B^{h(\theta)} - A^{h(\theta)}} \le L(\theta) \le \dfrac{1 - \eta_\theta A^{h(\theta)}}{B^{h(\theta)} - \eta_\theta A^{h(\theta)}}$

where $h(\theta) < 0$.

We can summarize our results as follows. If $h(\theta) > 0$, limits for
$L(\theta)$ are given in (A:31). If $h(\theta) < 0$, limits for
$L(\theta)$ are given in (A:34). The quantities $\eta_\theta$ and
$\delta_\theta$ are defined in (A:27) and (A:28), respectively.

In Sections A.2.4 and A.2.5 we shall calculate the values of
$\delta_\theta$ and $\eta_\theta$ for binomial and normal distributions. If
the limits of $L(\theta)$ as given in (A:31) and (A:34) are too far apart,
it may be desirable to determine the exact value of $L(\theta)$, or at
least to find a closer approximation to $L(\theta)$ than that given in
(A:31) and (A:34). A method of dealing with this problem is described in
Section A.4. There the exact value of $L(\theta)$ is derived when z can
take only a finite number of integral multiples of a constant d. If z does
not have this property, arbitrarily fine approximations to the value of
$L(\theta)$ can be obtained, since the distribution of z can be
approximated to any desired degree by a discrete distribution of the type
mentioned above if the constant d is chosen sufficiently small.


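A.2.4 Calculation of $\delta_\theta$ and $\eta_\theta$ for Binomial Distributions

Let X be a random variable which can take only the values 0 and 1.
Let $p_i$ be the probability that X = 1 when $H_i$ is true (i = 0, 1). Let
H be the hypothesis that p is the probability that X = 1. Denote
$1 - p$ by q and $1 - p_i$ by $q_i$ (i = 0, 1). The distribution f(x, p) of
x is given as follows: f(1, p) = p and f(0, p) = q. It can be assumed
without loss of generality that $p_1 > p_0$. The moment generating
function of $z = \log \dfrac{f(x, p_1)}{f(x, p_0)}$ is given by

$\varphi(t) = E_p(e^{zt}) = p \left(\dfrac{p_1}{p_0}\right)^t + q \left(\dfrac{q_1}{q_0}\right)^t$

Let $h(p) \ne 0$ be the value of t for which $\varphi(t) = 1$, i.e.,

$p \left(\dfrac{p_1}{p_0}\right)^{h(p)} + q \left(\dfrac{q_1}{q_0}\right)^{h(p)} = 1$

First we consider the case when $h(p) > 0$. It is clear that
$e^z = \dfrac{f(x, p_1)}{f(x, p_0)} > 1$ implies that x = 1. Hence
$e^{h(p) z} > 1$ implies that

$e^{h(p) z} = \left[\dfrac{f(1, p_1)}{f(1, p_0)}\right]^{h(p)} = \left(\dfrac{p_1}{p_0}\right)^{h(p)}$

From this and the definition of $\delta_p$ given in (A:28) it follows that

(A:35)    $\delta_p = \left(\dfrac{p_1}{p_0}\right)^{h(p)}$

where $h(p) > 0$. Similarly, the inequality $e^{h(p) z} < 1$ implies that
$e^{h(p) z} = \left(\dfrac{q_1}{q_0}\right)^{h(p)}$. From this and the
definition of $\eta_p$ given in (A:27) it follows that

(A:36)    $\eta_p = \left(\dfrac{q_1}{q_0}\right)^{h(p)}$

where $h(p) > 0$.

If $h(p) < 0$, it can be shown in a similar way that

(A:37)    $\delta_p = \left(\dfrac{q_1}{q_0}\right)^{h(p)}$

where $h(p) < 0$, and

(A:38)    $\eta_p = \left(\dfrac{p_1}{p_0}\right)^{h(p)}$

where $h(p) < 0$.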
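
The binomial formulas can be combined with the limits (A:31) and (A:34) of Section A.2.3 in a few lines. The following sketch is an illustration under stated assumptions, not part of the original text: h(p) is found by bisection, the boundaries are taken as A = (1 - beta)/alpha and B = beta/(1 - alpha), and the names `binomial_h` and `oc_limits` are hypothetical.

```python
import math

def binomial_h(p, p0, p1, iters=200):
    """Nonzero root h of p*(p1/p0)^h + q*(q1/q0)^h = 1, by bisection."""
    q = 1.0 - p
    a = math.log(p1 / p0)                  # value of z when x = 1
    b = math.log((1 - p1) / (1 - p0))      # value of z when x = 0
    g = lambda h: p * math.exp(a * h) + q * math.exp(b * h)
    mean = p * a + q * b
    if mean == 0:
        raise ValueError("E(z) = 0: there is no nonzero root")
    hi = -1.0 if mean > 0 else 1.0         # h and E(z) have opposite signs
    while g(hi) < 1.0:
        hi *= 2.0
    lo = hi / 2.0
    while g(lo) >= 1.0 and abs(lo) > 1e-12:
        hi, lo = lo, lo / 2.0
    for _ in range(iters):                 # g(lo) < 1 <= g(hi)
        mid = 0.5 * (lo + hi)
        if g(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def oc_limits(p, p0, p1, alpha, beta):
    """Return (lower limit, approximation (A:20), upper limit) for L(p)."""
    A = (1 - beta) / alpha
    B = beta / (1 - alpha)
    h = binomial_h(p, p0, p1)
    up = (p1 / p0) ** h
    down = ((1 - p1) / (1 - p0)) ** h
    if h > 0:                              # (A:35), (A:36), then (A:31)
        delta, eta = up, down
        lower = (A ** h - 1) / (A ** h - eta * B ** h)
        upper = (delta * A ** h - 1) / (delta * A ** h - B ** h)
    else:                                  # (A:37), (A:38), then (A:34)
        delta, eta = down, up
        lower = (1 - A ** h) / (delta * B ** h - A ** h)
        upper = (1 - eta * A ** h) / (B ** h - eta * A ** h)
    approx = (A ** h - 1) / (A ** h - B ** h)
    return lower, approx, upper

for p in (0.30, 0.40, 0.45, 0.55, 0.60, 0.70):
    print(p, oc_limits(p, 0.4, 0.6, 0.05, 0.05))
```

For p = p0 = 0.4 the approximation is 0.95 and the limits are roughly 0.949 and 0.967, so in this example the neglected excess over the boundaries matters only in the second decimal place.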



A.2.5 Calculation of $\delta_\theta$ and $\eta_\theta$ for Normal Distributions

We shall now assume that X is normally distributed with unknown
mean $\theta$ and known variance $\sigma^2$. We can assume without loss of
generality that $\sigma = 1$, since this can always be achieved by
multiplication by a proportionality factor. Then

(A:39)    $f(x, \theta_i) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta_i)^2}$    (i = 0, 1)

and

(A:40)    $f(x, \theta) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta)^2}$

We can assume without loss of generality that $\theta_0 = -\Delta$ and
$\theta_1 = \Delta$ where $\Delta > 0$, since this can always be achieved
by a translation. Then

(A:41)    $z = \log \dfrac{f(x, \theta_1)}{f(x, \theta_0)} = 2\Delta x$

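The moment generating function of z is given by

(A:42)    $E_\theta(e^{zt}) = e^{2\Delta\theta t + 2\Delta^2 t^2}$

Hence

(A:43)    $h(\theta) = -\dfrac{\theta}{\Delta}$

Substituting this value of $h(\theta)$ in (A:27) and (A:28) we obtain

(A:44)    $\delta_\theta = \text{l.u.b.}_\rho\ \rho E_\theta\left(e^{-2\theta x} \mid e^{-2\theta x} \ge \tfrac{1}{\rho}\right)$

and

(A:45)    $\eta_\theta = \text{g.l.b.}_\zeta\ \zeta E_\theta\left(e^{-2\theta x} \mid e^{-2\theta x} \le \tfrac{1}{\zeta}\right)$

For any relation R let $P_\theta^*(R)$ denote the probability that the
relation R holds under the assumption that the distribution of x is normal
with mean $\theta$ and variance unity. Furthermore, let
$P_\theta^{**}(R)$ denote the probability that R holds if the distribution
of x is normal with mean $-\theta$ and variance unity. Since
$e^{-2\theta x}$ is equal to the ratio of the normal probability density
function with mean $-\theta$ and variance unity to the normal probability
density function with mean $\theta$ and variance unity, we see that

(A:46)    $E_\theta\left(e^{-2\theta x} \mid e^{-2\theta x} \ge \tfrac{1}{\rho}\right) = \dfrac{P_\theta^{**}\left(e^{-2\theta x} \ge \tfrac{1}{\rho}\right)}{P_\theta^{*}\left(e^{-2\theta x} \ge \tfrac{1}{\rho}\right)}$

and

(A:47)    $E_\theta\left(e^{-2\theta x} \mid e^{-2\theta x} \le \tfrac{1}{\zeta}\right) = \dfrac{P_\theta^{**}\left(e^{-2\theta x} \le \tfrac{1}{\zeta}\right)}{P_\theta^{*}\left(e^{-2\theta x} \le \tfrac{1}{\zeta}\right)}$

It can easily be verified that the right-hand members of (A:46) and
(A:47) have the same values for $\theta = \lambda$ as for
$\theta = -\lambda$. Thus, $\delta_\theta$ and $\eta_\theta$ also have the
same values for $\theta = \lambda$ as for $\theta = -\lambda$. It will
therefore be sufficient to compute $\delta_\theta$ and $\eta_\theta$ for
negative values of $\theta$. Let $\theta = -\lambda$ where $\lambda > 0$.
First we show that $\eta_\theta = 1/\delta_\theta$. Clearly,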



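(A:48)    $\zeta E_\theta\left(e^{2\lambda x} \mid e^{2\lambda x} \le \tfrac{1}{\zeta}\right) = \zeta\, \dfrac{P_\theta^{**}\left(e^{2\lambda x} \le \tfrac{1}{\zeta}\right)}{P_\theta^{*}\left(e^{2\lambda x} \le \tfrac{1}{\zeta}\right)}$    ($1 \le \zeta < \infty$)

Letting $\zeta = 1/\rho$ ($0 < \rho \le 1$) in (A:48) gives

(A:49)    $\zeta E_\theta\left(e^{2\lambda x} \mid e^{2\lambda x} \le \tfrac{1}{\zeta}\right) = \dfrac{1}{\rho}\, \dfrac{P_\theta^{**}(e^{2\lambda x} \le \rho)}{P_\theta^{*}(e^{2\lambda x} \le \rho)} = \dfrac{1}{\rho\, \dfrac{P_\theta^{*}(e^{2\lambda x} \le \rho)}{P_\theta^{**}(e^{2\lambda x} \le \rho)}}$

Hence

(A:50)    $\eta_\theta = \text{g.l.b.}_\zeta\ \zeta E_\theta\left(e^{2\lambda x} \mid e^{2\lambda x} \le \tfrac{1}{\zeta}\right) = \dfrac{1}{\text{l.u.b.}_\rho\ \rho\, \dfrac{P_\theta^{*}(e^{2\lambda x} \le \rho)}{P_\theta^{**}(e^{2\lambda x} \le \rho)}}$

Because of the symmetry of the normal distribution, it is easily seen
that

$\text{l.u.b.}_\rho\ \rho\, \dfrac{P_\theta^{*}(e^{2\lambda x} \le \rho)}{P_\theta^{**}(e^{2\lambda x} \le \rho)} = \text{l.u.b.}_\rho\ \rho\, \dfrac{P_\theta^{**}\left(e^{2\lambda x} \ge \tfrac{1}{\rho}\right)}{P_\theta^{*}\left(e^{2\lambda x} \ge \tfrac{1}{\rho}\right)} = \delta_\theta$

Hence

(A:51)    $\eta_\theta = \dfrac{1}{\delta_\theta}$

Now we shall calculate the value of $\delta_\theta$. Let G(x) denote

$G(x) = \dfrac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\, dt$

Then

$P_\theta^{**}\left(e^{2\lambda x} \ge \tfrac{1}{\rho}\right) = G\left(\dfrac{1}{2\lambda} \log \dfrac{1}{\rho} - \lambda\right)$

Similarly

$P_\theta^{*}\left(e^{2\lambda x} \ge \tfrac{1}{\rho}\right) = G\left(\dfrac{1}{2\lambda} \log \dfrac{1}{\rho} + \lambda\right)$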



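Let u denote $(1/2\lambda) \log (1/\rho)$. Since $\rho$ can vary from 0 to
1, u can take any value from 0 to $\infty$. Since
$\rho = e^{-2\lambda u}$, we have

(A:52)    $\delta_\theta = \text{l.u.b.}_\rho\ \rho\, \dfrac{P_\theta^{**}\left(e^{2\lambda x} \ge \tfrac{1}{\rho}\right)}{P_\theta^{*}\left(e^{2\lambda x} \ge \tfrac{1}{\rho}\right)} = \text{l.u.b.}_u \left(e^{-2\lambda u}\, \dfrac{G(u - \lambda)}{G(u + \lambda)}\right)$    ($0 \le u < \infty$)

We shall prove that

(A:53)    $\chi(u) = e^{-2\lambda u}\, \dfrac{G(u - \lambda)}{G(u + \lambda)}$

is a monotonically decreasing function of u and consequently has a
maximum at u = 0. For this purpose it suffices to show that the
derivative of $\log \chi(u)$ is never positive. Now

(A:54)    $\log \chi(u) = \log G(u - \lambda) - \log G(u + \lambda) - 2\lambda u$

Let $\Phi(x)$ denote $\dfrac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$. Since
$G'(u) = -\Phi(u)$, it follows from (A:54) that

(A:55)    $\dfrac{d}{du} \log \chi(u) = -\dfrac{\Phi(u - \lambda)}{G(u - \lambda)} + \dfrac{\Phi(u + \lambda)}{G(u + \lambda)} - 2\lambda$

It follows from the mean value theorem that the right-hand side of
(A:55) is never positive if
$\dfrac{d}{du}\left[\dfrac{\Phi(u)}{G(u)}\right] \le 1$ for all values of
u. Thus, we need merely to show that

(A:56)    $\dfrac{d}{du}\left[\dfrac{\Phi(u)}{G(u)}\right] = \dfrac{\Phi'(u) G(u) - G'(u) \Phi(u)}{G^2(u)} = \dfrac{\Phi'(u) G(u) + \Phi^2(u)}{G^2(u)} = \dfrac{\Phi^2(u)}{G^2(u)} - u\, \dfrac{\Phi(u)}{G(u)} \le 1$

Let y denote $\dfrac{\Phi(u)}{G(u)}$. The roots of the equation
$y^2 - uy - 1 = 0$ are

$y = \dfrac{u \pm \sqrt{u^2 + 4}}{2}$

Hence the inequality $y^2 - uy - 1 \le 0$ holds if, and only if,

$\dfrac{u - \sqrt{u^2 + 4}}{2} \le y \le \dfrac{u + \sqrt{u^2 + 4}}{2}$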

Since y cannot be negative, this inequality is equivalent to

(A:57)    $\dfrac{\Phi(u)}{G(u)} = y \le \dfrac{u + \sqrt{u^2 + 4}}{2}$

Thus we merely have to prove (A:57). We shall show that (A:57)
holds for all real values of u. Birnbaum * has shown that for u > 0

(A:58)    $\dfrac{\sqrt{u^2 + 4} - u}{2} \le \dfrac{G(u)}{\Phi(u)}$

Hence

(A:59)    $\dfrac{\Phi(u)}{G(u)} \le \dfrac{2}{\sqrt{u^2 + 4} - u} = \dfrac{\sqrt{u^2 + 4} + u}{2}$    (u > 0)

which proves (A:57) for u > 0. Now we prove (A:57) for u < 0. Let
$u = -v$ where v > 0. Then it follows from (A:59) that

(A:60)    $\dfrac{\Phi(v)}{G(v)} \le \dfrac{\sqrt{v^2 + 4} + v}{2}$

Taking reciprocals, we obtain, from (A:60),

(A:61)    $\dfrac{G(v)}{\Phi(v)} \ge \dfrac{2}{\sqrt{v^2 + 4} + v} = \dfrac{\sqrt{v^2 + 4} - v}{2}$

Since

$\dfrac{G(u)}{\Phi(u)} \ge \dfrac{G(v) + 2v\Phi(v)}{\Phi(v)} = \dfrac{G(v)}{\Phi(v)} + 2v$

we obtain, from (A:61),

(A:62)    $\dfrac{G(u)}{\Phi(u)} \ge \dfrac{\sqrt{v^2 + 4} + 3v}{2} \ge \dfrac{\sqrt{v^2 + 4} + v}{2}$

Taking reciprocals, we obtain

$\dfrac{\Phi(u)}{G(u)} \le \dfrac{2}{\sqrt{v^2 + 4} + v} = \dfrac{\sqrt{v^2 + 4} - v}{2} = \dfrac{\sqrt{u^2 + 4} + u}{2}$

Hence (A:57) is proved for all values of u and consequently
$\delta_\theta$ is equal to the value of the expression (A:53) if we
substitute 0 for u. Thus

(A:63)    $\delta_\theta = \dfrac{1}{\eta_\theta} = \dfrac{G(-\lambda)}{G(\lambda)}$

* Z. W. Birnbaum, "An Inequality for Mills' Ratio," The Annals of
Mathematical Statistics, Vol. XIII (1942).

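Formula (A:63) has been derived for the case in which
$\theta_0 = -\Delta$, $\theta_1 = \Delta$, and $\sigma = 1$. It can easily
be seen that for general values $\theta_0$, $\theta_1$, and $\sigma$ we
have

(A:64)    $\delta_\theta = \dfrac{1}{\eta_\theta} = \dfrac{G(-\lambda)}{G(\lambda)}$    where $\lambda = \dfrac{1}{\sigma} \left| \dfrac{\theta_0 + \theta_1}{2} - \theta \right|$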

A.3 UPPER AND LOWER LIMITS FOR THE ASN FUNCTION OF A SE- 
QUENTIAL PROBABILITY RATIO TEST 

A.3.1 Derivation of General Formulas for Upper and Lower Limits 

As before, let

$z = \log \dfrac{f(x, \theta_1)}{f(x, \theta_0)}$,    $z_i = \log \dfrac{f(x_i, \theta_1)}{f(x_i, \theta_0)}$    ($i = 1, 2, \ldots$, ad inf.)

and let n be the number of observations required by the sequential
test, i.e., n is the smallest integer for which
$Z_n = z_1 + \cdots + z_n$ is either $\ge \log A$ or $\le \log B$. To
determine the expected value $E_\theta(n)$ of n under the hypothesis H
that $\theta$ is the true value of the parameter, we shall consider a
fixed positive integer N. The sum $Z_N = z_1 + \cdots + z_N$ can be split
in two parts as follows:

(A:65)    $Z_N = Z_n + Z'_n$

where $Z'_n = z_{n+1} + \cdots + z_N$ if $n \le N$ and
$Z'_n = Z_N - Z_n$ if $n > N$. Taking expected values on both sides of
(A:65) we obtain

$N E_\theta(z) = E_\theta(Z_n + Z'_n)$

Let $P_N$ denote the probability that $n \le N$. Then

$E_\theta(Z_n + Z'_n) = P_N E_{\theta N}(Z_n + Z'_n) + (1 - P_N) E_{\theta N}^*(Z_N)$

where the operator $E_{\theta N}$ means conditional expected value when
$n \le N$, and $E_{\theta N}^*$ means conditional expected value when
$n > N$.

Since $Z_N$ lies between log B and log A when $n > N$, and since
$\lim_{N \to \infty} (1 - P_N) = 0$, we obtain from the last two equations

(A:66)    $\lim_{N \to \infty} \left[N E_\theta(z) - P_N E_{\theta N}(Z_n + Z'_n)\right] = 0$

For any given value of $n < N$, the variates $z_{n+1}, \ldots, z_N$ are
independently distributed, each having the same distribution as z. Hence,
we have

$E_{\theta N}(Z'_n) = E_{\theta N}(N - n) E_\theta(z) = -E_{\theta N}(n) E_\theta(z) + N E_\theta(z)$

From this and (A:66) we obtain, since
$\lim_{N \to \infty} N(1 - P_N) = 0$,*

(A:67)    $\lim_{N \to \infty} \left[P_N E_{\theta N}(Z_n) - P_N E_{\theta N}(n) E_\theta(z)\right] = 0$

Since

$\lim_{N \to \infty} P_N E_{\theta N}(n) = E_\theta(n)$  and  $\lim_{N \to \infty} P_N E_{\theta N}(Z_n) = E_\theta(Z_n)$

equation (A:67) gives

(A:68)    $E_\theta(Z_n) = E_\theta(n) E_\theta(z)$

Hence

(A:69)    $E_\theta(n) = \dfrac{E_\theta(Z_n)}{E_\theta(z)}$

if $E_\theta(z) \ne 0$. Let $E_\theta^*(Z_n)$ be the conditional expected
value of $Z_n$ under the restriction that the sequential analysis leads to
the acceptance of $H_0$, i.e., that $Z_n \le \log B$. Similarly, let
$E_\theta^{**}(Z_n)$ be the conditional expected value of $Z_n$ under the
restriction that $H_1$ is accepted, i.e., that $Z_n \ge \log A$. Since
$L(\theta)$ is the probability that $Z_n \le \log B$, and
$1 - L(\theta)$ is the probability that $Z_n \ge \log A$, we have

(A:70)    $E_\theta(Z_n) = L(\theta) E_\theta^*(Z_n) + [1 - L(\theta)] E_\theta^{**}(Z_n)$

From (A:69) and (A:70) we obtain

(A:71)    $E_\theta(n) = \dfrac{L(\theta) E_\theta^*(Z_n) + [1 - L(\theta)] E_\theta^{**}(Z_n)}{E_\theta(z)}$

The exact value of $E_\theta(Z_n)$, and therefore also the exact value of
$E_\theta(n)$, can be computed if z can take only integral multiples of a
constant d, since in this case the exact probability distribution of
$Z_n$ was obtained (see Section A.4). If z does not satisfy the above
restriction, it is still possible to obtain arbitrarily fine
approximations to the value of $E_\theta(Z_n)$, since the distribution of
z can be approximated to any desired degree by a discrete distribution of
the type mentioned above provided the constant d is chosen sufficiently
small.

* C. Stein has shown, in "A Note on Cumulative Sums," The Annals of
Mathematical Statistics, Vol. 17 (1946), that all moments of n must be
finite. This implies that $\lim_{N \to \infty} N^k (1 - P_N) = 0$ for any
positive integer k.
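
Relation (A:68) can be verified exactly when z takes only the values +1 and -1, which is the simplest instance of the discrete case just mentioned. The sketch below is an illustration under that assumption and is not the method of Section A.4: it propagates the distribution of the cumulative sum between integer boundaries, and the name `walk_moments` and the numbers used are hypothetical.

```python
def walk_moments(p, upper, lower, tol=1e-15, max_steps=100_000):
    """Return (E[n], E[Z_n]) for a +1/-1 walk stopped at upper or lower."""
    q = 1.0 - p
    alive = {0: 1.0}          # probability mass of paths not yet stopped
    e_n = 0.0
    e_z = 0.0
    n = 0
    while sum(alive.values()) > tol and n < max_steps:
        n += 1
        nxt = {}
        for s, mass in alive.items():
            for step, w in ((1, p), (-1, q)):
                s2 = s + step
                if s2 >= upper or s2 <= lower:
                    e_n += mass * w * n      # stops at trial n
                    e_z += mass * w * s2     # with cumulative sum s2
                else:
                    nxt[s2] = nxt.get(s2, 0.0) + mass * w
        alive = nxt
    return e_n, e_z

p = 0.4
mean_z = p - (1.0 - p)                 # E(z) = -0.2
e_n, e_z = walk_moments(p, 4, -3)
print(e_n, e_z, e_n * mean_z)
```

The last two printed numbers coincide up to rounding, as (A:68) requires.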

If both $|E_\theta(z)|$ and the standard deviation of z are small,
$E_\theta^*(Z_n)$ is very nearly equal to log B and $E_\theta^{**}(Z_n)$ is
very nearly equal to log A. Hence in this case we can write

(A:72)    $E_\theta(n) \approx \dfrac{L(\theta) \log B + [1 - L(\theta)] \log A}{E_\theta(z)}$

This is the same approximation formula as given in (3:57).

To judge the goodness of the approximation given in (A:72) we shall
derive lower and upper limits for $E_\theta(n)$ by deriving lower and
upper limits for $E_\theta^*(Z_n)$ and $E_\theta^{**}(Z_n)$. Let r be a
non-negative variable and let

(A:73)    $\xi_\theta = \max_r E_\theta(z - r \mid z \ge r)$    ($r \ge 0$)

and

(A:74)    $\xi'_\theta = \min_r E_\theta(z + r \mid z + r \le 0)$    ($r \ge 0$)

It is easy to see that

(A:75)    $\log A \le E_\theta^{**}(Z_n) \le \log A + \xi_\theta$

and

(A:76)    $\log B + \xi'_\theta \le E_\theta^*(Z_n) \le \log B$

We obtain from (A:71), (A:75), and (A:76)

(A:77)    $\dfrac{L(\theta)(\log B + \xi'_\theta) + [1 - L(\theta)] \log A}{E_\theta(z)} \le E_\theta(n) \le \dfrac{L(\theta) \log B + [1 - L(\theta)](\log A + \xi_\theta)}{E_\theta(z)}$    if $E_\theta(z) > 0$

and

(A:78)    $\dfrac{L(\theta) \log B + [1 - L(\theta)](\log A + \xi_\theta)}{E_\theta(z)} \le E_\theta(n) \le \dfrac{L(\theta)(\log B + \xi'_\theta) + [1 - L(\theta)] \log A}{E_\theta(z)}$    if $E_\theta(z) < 0$

The limits given in (A:77) and (A:78) will generally be close to each
other for values $\theta \le \theta_0$ and $\theta \ge \theta_1$. However,
for values $\theta$ between $\theta_0$ and $\theta_1$ the difference
between the upper and lower limits may become very large, since
$E_\theta(z)$ may be near (or equal to) 0 for such values $\theta$. In
fact, we have seen that $E_{\theta_0}(z) < 0$ and $E_{\theta_1}(z) > 0$.
Hence, if $E_\theta(z)$ is a continuous function of $\theta$, there will
be a value $\theta'$ between $\theta_0$ and $\theta_1$ such that
$E_{\theta'}(z) = 0$. For $\theta = \theta'$ or for values $\theta$ very
near $\theta'$ the limits given in (A:77) and (A:78) are of no practical
value, since they are far apart.
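
A small numerical sketch of (A:72), (A:77), and (A:78) is given below for a 0-1 variable. It is illustrative only: the function names are hypothetical, the value of L is taken from the approximation (A:20) at theta_0 rather than computed exactly, and for a two-valued z the quantities (A:73) and (A:74) reduce to the positive and the negative value of z.

```python
import math

def asn_approx(L, log_a, log_b, mean_z):
    """Approximation (A:72) to the expected number of observations."""
    return (L * log_b + (1.0 - L) * log_a) / mean_z

def asn_limits(L, log_a, log_b, mean_z, xi, xi_prime):
    """Limits (A:77) or (A:78), whichever matches the sign of mean_z."""
    first = (L * (log_b + xi_prime) + (1.0 - L) * log_a) / mean_z
    second = (L * log_b + (1.0 - L) * (log_a + xi)) / mean_z
    return min(first, second), max(first, second)

p0, p1, alpha, beta = 0.4, 0.6, 0.05, 0.05
log_a = math.log((1 - beta) / alpha)
log_b = math.log(beta / (1 - alpha))
z_pos = math.log(p1 / p0)                 # z when x = 1
z_neg = math.log((1 - p1) / (1 - p0))     # z when x = 0

# True probability p = p0, where (A:20) gives L = 1 - alpha.
mean_z = p0 * z_pos + (1 - p0) * z_neg
L = 1 - alpha
n_approx = asn_approx(L, log_a, log_b, mean_z)
n_lower, n_upper = asn_limits(L, log_a, log_b, mean_z, z_pos, z_neg)
print(n_lower, n_approx, n_upper)
```

With these numbers the approximation is about 32.7 observations and the limits are about 32.4 and 37.4.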

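We shall now derive limits for $E_\theta(n)$ which can be used for values
$\theta$ in the neighborhood of $\theta'$.* For this purpose, we shall
expand $e^{h(\theta) Z_n}$ in a Taylor series as follows:

(A:79)    $e^{h(\theta) Z_n} = 1 + h(\theta) Z_n + \dfrac{[h(\theta)]^2}{2} Z_n^2 + \dfrac{[h(\theta)]^3}{6} Z_n^3 e^{\lambda}$

where $\lambda$ is some value between 0 and $h(\theta) Z_n$. From (A:17)
and (A:79) we obtain

(A:80)    $h(\theta) E_\theta(Z_n) = -\tfrac{1}{2} [h(\theta)]^2 E_\theta(Z_n^2) - \tfrac{1}{6} [h(\theta)]^3 E_\theta(Z_n^3 e^{\lambda})$

From this and (A:69) it follows that

(A:81)    $E_\theta(n) = -\dfrac{h(\theta)}{2 E_\theta(z)}\, E_\theta(Z_n^2) - \dfrac{[h(\theta)]^2}{6 E_\theta(z)}\, E_\theta(Z_n^3 e^{\lambda})$

Thus, upper and lower limits for $E_\theta(n)$ can be obtained by
deriving upper and lower limits for $E_\theta(Z_n^2)$ and
$E_\theta(Z_n^3 e^{\lambda})$. To derive limits for $E_\theta(Z_n^2)$, we
write

(A:82)    $E_\theta(Z_n^2) = L(\theta) E_\theta^*(Z_n^2) + [1 - L(\theta)] E_\theta^{**}(Z_n^2)$

where the operator $E^*$ stands for conditional expected value when
$Z_n \le \log B$, and $E^{**}$ stands for conditional expected value when
$Z_n \ge \log A$. Let $\epsilon'$ denote $Z_n - \log B$ and $\epsilon$
denote $Z_n - \log A$. Then

(A:83)    $E_\theta^*(Z_n^2) = (\log B)^2 + 2(\log B) E_\theta^*(\epsilon') + E_\theta^*(\epsilon'^2)$

and

(A:84)    $E_\theta^{**}(Z_n^2) = (\log A)^2 + 2(\log A) E_\theta^{**}(\epsilon) + E_\theta^{**}(\epsilon^2)$

Since $E_\theta^*(\epsilon'^2) \ge 0$ and
$(\log B) E_\theta^*(\epsilon') \ge 0$, we obtain, from (A:83),

(A:85)    $(\log B)^2 \le E_\theta^*(Z_n^2)$

* See also the author's paper, "Some Improvements in Setting Limits for
the Expected Number of Observations Required by a Sequential Probability
Ratio Test," The Annals of Mathematical Statistics, Vol. 17 (1946).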

The quantity $\xi'_\theta$ given in (A:74) is a lower bound for
$E_\theta^*(\epsilon')$. Since $\log B < 0$, $(\log B) \xi'_\theta$ is an
upper bound for $(\log B) E_\theta^*(\epsilon')$. An upper bound for
$E_\theta^*(\epsilon'^2)$ is given by

(A:86)    $\zeta'_\theta = \max_r E_\theta[(z + r)^2 \mid z + r \le 0]$    ($r \ge 0$)

Hence

(A:87)    $E_\theta^*(Z_n^2) \le (\log B)^2 + 2(\log B) \xi'_\theta + \zeta'_\theta$

Thus we obtain the limits

(A:88)    $(\log B)^2 \le E_\theta^*(Z_n^2) \le (\log B)^2 + 2(\log B) \xi'_\theta + \zeta'_\theta$

In a similar way, the following limits can be derived for
$E_\theta^{**}(Z_n^2)$:

(A:89)    $(\log A)^2 \le E_\theta^{**}(Z_n^2) \le (\log A)^2 + 2(\log A) \xi_\theta + \zeta_\theta$

where $\xi_\theta$ is given in (A:73) and

(A:90)    $\zeta_\theta = \max_r E_\theta[(z - r)^2 \mid z \ge r]$    ($r \ge 0$)

If we denote by $L'(\theta)$ the lower limit and by $L''(\theta)$ the
upper limit of $L(\theta)$ given in (A:31) [(A:34) when
$h(\theta) < 0$], we obtain from (A:82), (A:88), and (A:89) the following
limits for $E_\theta(Z_n^2)$:

(A:91)    $L'(\theta)(\log B)^2 + [1 - L''(\theta)](\log A)^2 \le E_\theta(Z_n^2) \le L''(\theta)[(\log B)^2 + 2(\log B) \xi'_\theta + \zeta'_\theta] + [1 - L'(\theta)][(\log A)^2 + 2(\log A) \xi_\theta + \zeta_\theta]$

Using a similar method, one can also derive upper and lower limits
for $E_\theta(Z_n^3 e^{\lambda})$ without any difficulty. We shall,
however, not derive such limits here, since we are interested in obtaining
limits for $E_\theta(n)$ when $\theta$ is near $\theta'$ and since, for
such values of $\theta$, the second term in the right-hand member of
(A:81) is negligible. We shall show that, if $h(\theta)$, $E_\theta(z)$,
and $E_\theta(z^2)$ are continuous functions of $\theta$, the factor
$[h(\theta)]^2 / E_\theta(z)$ in that term converges to 0 as
$\theta \to \theta'$. It follows from the discussion given in Section
A.2.1 that $\lim_{\theta \to \theta'} h(\theta) = 0$. Since

(A:92)    $E_\theta\left\{1 + h(\theta) z + \dfrac{[h(\theta)]^2}{2!} z^2 + \dfrac{[h(\theta)]^3}{3!} z^3 e^{u h(\theta) z}\right\} = 1$    ($0 \le u \le 1$)

we obtain, when $h(\theta) \ne 0$,

(A:93)    $E_\theta\left\{z + \dfrac{h(\theta)}{2!} z^2 + \dfrac{[h(\theta)]^2}{3!} z^3 e^{u h(\theta) z}\right\} = 0$



Thus

(A:94)    $\dfrac{E_\theta(z)}{h(\theta)} = -\dfrac{1}{2} E_\theta(z^2) - \dfrac{h(\theta)}{3!} E_\theta(z^3 e^{u h(\theta) z})$    [$h(\theta) \ne 0$]

Assuming that $E_\theta(e^{2|z|})$ is a bounded function of $\theta$ in the
neighborhood of $\theta'$, we see that
$E_\theta(|z|^3 e^{|h(\theta) z|})$ is also a bounded function of $\theta$
in a sufficiently small neighborhood of $\theta'$.* Hence,
$E_\theta(z^3 e^{u h(\theta) z})$ is also a bounded function of $\theta$ in
the neighborhood of $\theta'$. From this and (A:94) it follows that

(A:95)    $\lim_{\theta = \theta'} \dfrac{E_\theta(z)}{h(\theta)} = -\dfrac{1}{2} E_{\theta'}(z^2) < 0$

From (A:95) it follows that

(A:96)    $\lim_{\theta = \theta'} \dfrac{[h(\theta)]^2}{E_\theta(z)} = 0$

The lower and upper limits for $E_\theta(n)$, based on (A:81), will
generally be close to each other for values $\theta$ in a small
neighborhood of $\theta'$. Thus, when $\theta$ is near $\theta'$ these
limits for $E_\theta(n)$ can be used instead of the limits given in (A:77)
and (A:78).

It may be of interest to determine the limiting form of (A:81) when
$\theta = \theta'$. If $E_\theta(Z_n^2)$ is a continuous function of
$\theta$ and $E_\theta(Z_n^3 e^{\lambda})$ is a bounded function of
$\theta$ in the neighborhood of $\theta'$, it follows from (A:81), (A:95),
and (A:96) that *

(A:97)    $E_{\theta'}(n) = \dfrac{E_{\theta'}(Z_n^2)}{E_{\theta'}(z^2)}$

The boundedness of $E_\theta(Z_n^3 e^{\lambda})$ can be proved if, for
$t = \pm 1$, the expected value
$\rho E_\theta\left(e^{tz} \mid e^{tz} \ge \tfrac{1}{\rho}\right)$ is a
bounded function of $\theta$ and $\rho$ ($0 < \rho < 1$). Since
$\lim_{\theta \to \theta'} h(\theta) = 0$, there exists a constant C such
that $|Z_n^3 e^{\lambda}| \le C e^{|Z_n|}$ for $\theta$ in the
neighborhood of $\theta'$. Hence, we merely have to show that
$E_\theta(e^{|Z_n|})$ is bounded. Since
$e^{|Z_n|} \le e^{Z_n} + e^{-Z_n}$ it is sufficient to show that both
$E_\theta(e^{Z_n})$ and $E_\theta(e^{-Z_n})$ are bounded. We have

$E_\theta(e^{Z_n} \mid Z_n \ge \log A) \le A\ \text{l.u.b.}_\rho\ \rho E_\theta\left(e^{z} \mid e^{z} \ge \tfrac{1}{\rho}\right)$

where $0 < \rho < 1$. Since

$E_\theta(e^{Z_n} \mid Z_n \le \log B) \le B$

we obtain

$E_\theta(e^{Z_n}) \le A\ \text{l.u.b.}_\rho\ \rho E_\theta\left(e^{z} \mid e^{z} \ge \tfrac{1}{\rho}\right) + B$

The right-hand member of this inequality is bounded, since
$\rho E_\theta\left(e^{z} \mid e^{z} \ge \tfrac{1}{\rho}\right)$ is bounded
by assumption. Hence $E_\theta(e^{Z_n})$ is bounded. The boundedness of
$E_\theta(e^{-Z_n})$ can be shown in a similar way. Upper and lower limits
for $E_{\theta'}(n)$ can be obtained from (A:97) by substituting for
$E_{\theta'}(Z_n^2)$ the upper and lower limits given in (A:91).

* This follows from the fact that $|h(\theta)| < 1$ when $\theta$ is
sufficiently near $\theta'$.

* A different method for deriving (A:97) was given in the author's paper,
"Differentiation under the Expectation Sign in the Fundamental Identity,"
The Annals of Mathematical Statistics, Vol. 17 (1946).

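We shall now compute an approximate value of $E_{\theta'}(n)$ neglecting
the excess of $Z_n$ over the boundaries. Since $h(\theta') = 0$, we
obtain, from (3:43),

(A:98)    $L(\theta') \approx \dfrac{\log A}{\log A - \log B}$

Hence

$E_{\theta'}(Z_n^2) \approx \dfrac{\log A}{\log A - \log B} (\log B)^2 + \dfrac{-\log B}{\log A - \log B} (\log A)^2 = -\log B \log A$

Thus an approximate value of $E_{\theta'}(n)$ is given by *

(A:99)    $E_{\theta'}(n) \approx \dfrac{E_{\theta'}(Z_n^2)}{E_{\theta'}(z^2)} \approx \dfrac{-\log B \log A}{E_{\theta'}(z^2)}$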

If the OC function L{e) of the test is known exactly, close lunits for 
P.fn't can be derived which remain valid over the entire range of 9. 
We shaU indicate briefly the derivation of such limits. Denote by 
Mz) the distribution of 2 when 6 is the true value of the parameter 
By the distribution of 2 conjugate to the distribution *( 2 ) we shall 
mean the distribution e*'"’ */<, ( 2 ) . In important cases, such as for bi- 
nomial and normal distributions, to any given value 0 of the param- 
eter there will correspond a value 6 such that Mz) is conjugate to 

⁶ W. Allen Wallis obtained this approximation formula independently of the author. It is included in the publication of the Statistical Research Group of Columbia University, Techniques of Statistical Analysis, Chapter 17, Section 7, McGraw-Hill, New York (1946).





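$f_\theta(z)$, i.e., $f_{\bar\theta}(z) = e^{h(\theta)z} f_\theta(z)$. We shall call $\bar\theta$ conjugate to $\theta$. It has been shown elsewhere⁷ that

(A:100) $$E_\theta^{*}\!\left(e^{h(\theta)Z_n}\right) = \frac{L(\bar\theta)}{L(\theta)} \quad\text{and}\quad E_\theta^{**}\!\left(e^{h(\theta)Z_n}\right) = \frac{1 - L(\bar\theta)}{1 - L(\theta)}$$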


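On the other hand,

(A:101) $$E_\theta^{*}\!\left(e^{h(\theta)Z_n}\right) = e^{h(\theta)E_\theta^{*}(Z_n)}\left\{1 + \frac{[h(\theta)]^2}{2}\, E_\theta^{*}\!\left([Z_n - E_\theta^{*}(Z_n)]^2 e^{v}\right)\right\}$$

where $v$ lies between 0 and $h(\theta)[Z_n - E_\theta^{*}(Z_n)]$. Similarly

(A:102) $$E_\theta^{**}\!\left(e^{h(\theta)Z_n}\right) = e^{h(\theta)E_\theta^{**}(Z_n)}\left\{1 + \frac{[h(\theta)]^2}{2}\, E_\theta^{**}\!\left([Z_n - E_\theta^{**}(Z_n)]^2 e^{v'}\right)\right\}$$

where $v'$ lies between 0 and $h(\theta)[Z_n - E_\theta^{**}(Z_n)]$. From (A:100), (A:101), and (A:102) we obtain

(A:103) $$\frac{E_\theta^{*}(Z_n)}{E_\theta(z)} = \frac{1}{h(\theta)E_\theta(z)} \log\frac{L(\bar\theta)}{L(\theta)} - \frac{1}{h(\theta)E_\theta(z)} \log\left(1 + \frac{[h(\theta)]^2}{2}\, E_\theta^{*}\!\left\{[Z_n - E_\theta^{*}(Z_n)]^2 e^{v}\right\}\right)$$

and

(A:104) $$\frac{E_\theta^{**}(Z_n)}{E_\theta(z)} = \frac{1}{h(\theta)E_\theta(z)} \log\frac{1 - L(\bar\theta)}{1 - L(\theta)} - \frac{1}{h(\theta)E_\theta(z)} \log\left(1 + \frac{[h(\theta)]^2}{2}\, E_\theta^{**}\!\left\{[Z_n - E_\theta^{**}(Z_n)]^2 e^{v'}\right\}\right)$$

Thus

(A:105) $$E_\theta(n) = \frac{1}{h(\theta)E_\theta(z)}\left[L(\theta)\log\frac{L(\bar\theta)}{L(\theta)} + [1 - L(\theta)]\log\frac{1 - L(\bar\theta)}{1 - L(\theta)}\right] + R$$

where

(A:106) $$R = -\frac{1}{h(\theta)E_\theta(z)}\left[L(\theta)\log\left(1 + \frac{[h(\theta)]^2}{2}\, E_\theta^{*}\!\left\{[Z_n - E_\theta^{*}(Z_n)]^2 e^{v}\right\}\right) + [1 - L(\theta)]\log\left(1 + \frac{[h(\theta)]^2}{2}\, E_\theta^{**}\!\left\{[Z_n - E_\theta^{**}(Z_n)]^2 e^{v'}\right\}\right)\right]$$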


⁷ See, for instance, the author's article on "Some Generalizations of the Theory of Cumulative Sums of Random Variables," The Annals of Mathematical Statistics, Vol. XVI (1945).




Since $h(\theta)E_\theta(z) \le 0$ (see Section A.2.1), we see that $R \ge 0$. Hence a
lower bound for $E_\theta(n)$ is obtained by substituting 0 for $R$ in (A:105).

To obtain an upper bound for $E_\theta(n)$ we shall derive an upper bound
for $R$. Clearly

(A:107) $$[(Z_n - \log B) + (E_\theta^{*}(Z_n) - \log B)]^2 \ge [Z_n - E_\theta^{*}(Z_n)]^2$$

whenever $Z_n \le \log B$. From this and (A:76) we obtain

(A:108) $$[(Z_n - \log B) + \xi'_\theta]^2 \ge [Z_n - E_\theta^{*}(Z_n)]^2$$

whenever $Z_n \le \log B$. Similarly, we obtain

(A:109) $$[(Z_n - \log A) + \xi_\theta]^2 \ge [Z_n - E_\theta^{**}(Z_n)]^2$$

whenever $Z_n \ge \log A$, where $\xi_\theta$ is given by (A:73). From (A:107),
(A:108), and (A:109) it follows that


(A:110) Eg*llZn - Eg*(Z„)]‘e''} 

^ E,*[(Zn - log B + f' 9 ) V '] 

and 

(A .111) Ee**{[Z„ - Eg**(Z„)fe’’'} 

< Eg**((Z„ - log A + «9)"e'^"- I] 

Furthermore, we have 
(A:112) Eg*[(Zn - log B + 

^ Max Eg*l(z -h r + {'«) V ^ ^ 01 = p' 

r^O 

and 

(A:113) Eg*n(Zn - log A + '] 

g Max Eg**[(z - r + *«> ' [ * - r & 0] = p" (say) 

rSO 

From (A:106) and (A:110) through (A:113) we obtain the following
upper bound for $R$:

(A:114) « ^ 


I 2 j 


[1 — L(0)] log 


1 + 


[h(0)] 


// 


n 


An upper limit for $E_\theta(n)$ is obtained by substituting $\bar R$ for $R$ in
(A:105). The value of $\bar R$ will generally be small over the entire range
of $\theta$.





A.3.2 Calculation of the Quantities $\xi_\theta$ and $\xi'_\theta$ for Binomial and Normal Distributions

Let $x$ be a random variable which can take only the values 0 and 1. Let the probability that $x = 1$ be denoted by $\theta$. Then the distribution of $x$ is given by $f(x, \theta)$, where $f(1, \theta) = \theta$ and $f(0, \theta) = 1 - \theta$. Let $H_i$ be the hypothesis that $\theta = \theta_i$ ($i = 0, 1$). It can be assumed without loss of generality that $\theta_1 > \theta_0$. It is clear that $\log \dfrac{f(x, \theta_1)}{f(x, \theta_0)} > 0$ implies that $x = 1$ and consequently $\log \dfrac{f(x, \theta_1)}{f(x, \theta_0)} = \log \dfrac{f(1, \theta_1)}{f(1, \theta_0)} = \log \dfrac{\theta_1}{\theta_0}$. Hence

(A:115) $$\xi_\theta = \max_r E_\theta(z - r \mid z \ge r) = \log \frac{\theta_1}{\theta_0}$$

Since $\log \dfrac{f(x, \theta_1)}{f(x, \theta_0)} \le 0$ implies that $x = 0$, we have

(A:116) $$\xi'_\theta = \min_r E_\theta(z + r \mid z + r \le 0) = \log \frac{1 - \theta_1}{1 - \theta_0}$$

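Now we shall calculate the values $\xi_\theta$ and $\xi'_\theta$ when $x$ is normally distributed with unit variance. Let

$$f(x, \theta_i) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta_i)^2} \quad (i = 0, 1)$$

and

$$f(x, \theta) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta)^2}$$

We may assume without loss of generality that $\theta_0 = -\Delta$ and $\theta_1 = \Delta$ where $\Delta > 0$, since this can always be achieved by a translation. Then

(A:117) $$z = \log \frac{f(x, \theta_1)}{f(x, \theta_0)} = 2\Delta x$$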


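Let $\phi(x)$ denote $\dfrac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$ and let $G(x)$ denote $\dfrac{1}{\sqrt{2\pi}} \displaystyle\int_x^\infty e^{-t^2/2}\, dt$. Let $t = x - \theta$. Then $z = 2\Delta(t + \theta)$ and

(A:118) $$E_\theta(z - r \mid z - r \ge 0) = 2\Delta\, E_\theta\!\left(t + \theta - \frac{r}{2\Delta} \,\Big|\, t + \theta - \frac{r}{2\Delta} \ge 0\right) = \frac{2\Delta}{G(t_0)} \int_{t_0}^{\infty} (t - t_0)\phi(t)\, dt = \frac{2\Delta}{G(t_0)}\left[-t_0 G(t_0) + \phi(t_0)\right]$$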




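where

(A:119) $$t_0 = \frac{r}{2\Delta} - \theta$$

In Section A.2.5, equation (A:56), it was proved that $[\phi(t_0)/G(t_0)] - t_0$ is a monotonically decreasing function of $t_0$. Hence the maximum of $E_\theta(z - r \mid z - r \ge 0)$ is reached when $r = 0$, and consequently

(A:120) $$\xi_\theta = \frac{2\Delta}{G(-\theta)}\left[\theta G(-\theta) + \phi(-\theta)\right] = 2\Delta\left[\theta + \frac{\phi(-\theta)}{G(-\theta)}\right]$$

Now we shall calculate $\xi'_\theta$. We have

(A:121) $$\xi'_\theta = \min_r E_\theta(z + r \mid z + r \le 0) = -\max_r E_\theta(-z - r \mid -z - r \ge 0) = -2\Delta \max_r E_\theta\!\left(-x - \frac{r}{2\Delta} \,\Big|\, -x - \frac{r}{2\Delta} \ge 0\right)$$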


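Let $t = -x + \theta$ and $t_0 = (r/2\Delta) + \theta$. Then

(A:122) $$E_\theta\!\left(-x - \frac{r}{2\Delta} \,\Big|\, -x - \frac{r}{2\Delta} \ge 0\right) = E_\theta(t - t_0 \mid t - t_0 \ge 0) = \frac{1}{G(t_0)} \int_{t_0}^{\infty} (t - t_0)\phi(t)\, dt = \frac{\phi(t_0)}{G(t_0)} - t_0$$

Since this is a monotonically decreasing function of $t_0$, we have

(A:123) $$\max_r E_\theta\!\left(-x - \frac{r}{2\Delta} \,\Big|\, -x - \frac{r}{2\Delta} \ge 0\right) = \frac{\phi(\theta)}{G(\theta)} - \theta$$

From (A:121) and (A:123) we obtain

(A:124) $$\xi'_\theta = -2\Delta\left[\frac{\phi(\theta)}{G(\theta)} - \theta\right]$$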

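Formulas (A:120) and (A:124) have been derived for the case when $\theta_0 = -\Delta$, $\theta_1 = \Delta$, and $\sigma = 1$. For general values $\theta_0$, $\theta_1$, and $\sigma$, the values of $\xi_\theta$ and $\xi'_\theta$ are given by

(A:125) $$\xi_\theta = \frac{1}{\sigma}(\theta_1 - \theta_0)\left[\theta^{*} + \frac{\phi(-\theta^{*})}{G(-\theta^{*})}\right]$$

and

(A:126) $$\xi'_\theta = -\frac{1}{\sigma}(\theta_1 - \theta_0)\left[\frac{\phi(\theta^{*})}{G(\theta^{*})} - \theta^{*}\right]$$

where

$$\theta^{*} = \frac{1}{\sigma}\left(\theta - \frac{\theta_0 + \theta_1}{2}\right)$$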
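The normal-case formulas for $\xi_\theta$ and $\xi'_\theta$ can be evaluated with the standard library alone. The sketch below is only an illustration under the assumption that the general formulas are the unit-variance formulas (A:120) and (A:124) rescaled by $(\theta_1 - \theta_0)/\sigma$; the function names are hypothetical.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def upper_tail(x):
    """G(x): probability that a standard normal variable is >= x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def xi_bounds_normal(theta, theta0, theta1, sigma=1.0):
    """Return (xi, xi_prime): bounds on the expected excess over the upper
    and lower boundary for normally distributed observations."""
    scale = (theta1 - theta0) / sigma
    s = (theta - 0.5 * (theta0 + theta1)) / sigma   # standardized parameter
    xi = scale * (s + phi(-s) / upper_tail(-s))
    xi_prime = -scale * (phi(s) / upper_tail(s) - s)
    return xi, xi_prime


xi_mid, xi_prime_mid = xi_bounds_normal(0.0, -1.0, 1.0)
xi_pos, xi_prime_pos = xi_bounds_normal(0.3, -1.0, 1.0)
xi_neg, xi_prime_neg = xi_bounds_normal(-0.3, -1.0, 1.0)
```

By symmetry of the normal distribution, the value of $\xi_\theta$ at a parameter point equals minus the value of $\xi'_\theta$ at the mirrored point, which gives a quick sanity check on an implementation.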





A.4 DERIVATION OF EXACT FORMULAS FOR THE OC AND ASN FUNCTIONS WHEN z CAN TAKE ONLY A FINITE NUMBER OF INTEGRAL MULTIPLES OF A CONSTANT

In this section we shall derive exact formulas for the OC and ASN functions when $z = \log \dfrac{f(x, \theta_1)}{f(x, \theta_0)}$ can take only a finite number of integral multiples of a positive constant $d$. This is a rather general result, since any distribution of $z$ can be approximated arbitrarily closely by a discrete distribution of the above type if the constant $d$ is chosen sufficiently small.

To obtain the exact OC and ASN functions, we shall first derive the exact probability distribution of the cumulative sum $Z_n = z_1 + \cdots + z_n$ at the termination of the sequential process. In what follows in this section the probability of any relation and the expected value of any random variable are determined under the assumption that $\theta$ is the true value of the parameter.¹ However, to simplify notation, we shall not put this in evidence in the formulas, i.e., we shall write $P$ instead of $P_\theta$ and $E$ instead of $E_\theta$. Let $g_1$ and $g_2$ be two positive integers such that $P(z = -g_1 d)$ and $P(z = g_2 d)$ are positive and $z$ can take only integral multiples of $d$ which are $\ge -g_1 d$ and $\le g_2 d$. Denote $P(z = id)$ by $h_i$. Then the moment-generating function of $z$ is given by

(A:127) $$E(e^{zt}) = \sum_{i=-g_1}^{g_2} h_i e^{idt} = \phi(t) \quad \text{(say)}$$

To obtain the roots of the equation $\phi(t) = 1$, we let $e^{dt} = u$ and solve the equation:

(A:128) $$\sum_{i=-g_1}^{g_2} h_i u^{i} = 1$$

Let $g$ denote $g_1 + g_2$ and let the $g$ roots of (A:128) be $u_1, \cdots, u_g$, respectively. We shall assume that no two roots are equal, i.e., $u_i \ne u_j$ for $i \ne j$. Substituting $u_i$ for $e^{dt}$ in the fundamental identity (A:16) we obtain

(A:129) $$E\!\left(u_i^{Z_n/d}\right) = 1 \quad (i = 1, \cdots, g)$$

Let $[a]$ be the smallest integer $\ge (\log A)/d$, and $[b]$ the largest integer $\le (\log B)/d$. Then $Z_n/d$ can take only the values

(A:130) $$([b] - g_1 + 1),\ ([b] - g_1 + 2),\ \cdots,\ [b],\ [a],\ ([a] + 1),\ \cdots,\ ([a] + g_2 - 1)$$

¹ If there are several unknown parameters, $\theta$ denotes the set of all parameters.




Denote the $g$ different values in (A:130) by $c_1, \cdots, c_g$, respectively. Furthermore, denote $P(Z_n = c_i d)$ by $\xi_i$. Then equations (A:129) can be written as

(A:131) $$\sum_{j=1}^{g} \xi_j u_i^{c_j} = 1 \quad (i = 1, \cdots, g)$$

Let $\Delta$ be the determinant value of the matrix $\| u_i^{c_j} \|$ ($i, j = 1, \cdots, g$) and let $\Delta_j$ be the determinant we obtain from $\Delta$ by substituting 1 for the elements in the $j$th column. If $\Delta \ne 0$, it follows from (A:131) that $P(Z_n = c_j d) = \xi_j$ is given by

(A:132) $$\xi_j = \frac{\Delta_j}{\Delta}$$

Thus, the probability $L(\theta)$ that the process will terminate with $Z_n \le \log B$ is given by

(A:133) $$L(\theta) = \sum_j \frac{\Delta_j}{\Delta}$$

where the summation is to be taken over all values $j$ for which $d c_j \le \log B$. Equation (A:133) is an exact equation of the OC function.

From the probability distribution of $Z_n$ we can easily derive the expected value $E_\theta(n)$ of $n$. In fact, in Section A.3 it has been shown that

$$E_\theta(n) = \frac{E_\theta(Z_n)}{E_\theta(z)}$$

But

(A:134) $$E_\theta(Z_n) = \sum_{j=1}^{g} \frac{c_j \Delta_j d}{\Delta}$$

Hence

(A:135) $$E_\theta(n) = \frac{1}{E_\theta(z)} \sum_{j=1}^{g} \frac{c_j \Delta_j d}{\Delta}$$

is the exact equation of the ASN function.

The method of obtaining the probabilities $\xi_1, \cdots, \xi_g$, as described above, requires the computation of the roots of the polynomial equation (A:128). This is not necessary, however, if a method given by Girshick is used.² Girshick proceeds as follows. Multiplying $\left(\sum_i h_i u^{i} - 1\right)$ by $u^{g_1}$ and $\left(\sum_j \xi_j u^{c_j} - 1\right)$ by $u^{g_1 - [b] - 1}$, we obtain two polynomials $f(u)$ and $F(u)$, where $f(u)$ is of degree $g_1 + g_2 = g$ and

2 M. A. Girshick, “Contributions to the Theory of Sequential Analysis,” The 
Annals of Mathematical Statistics, Vol. 17 (1946). 





$F(u)$ of degree $g + [a] - [b] - 2$. According to (A:128) and (A:131), every root of $f(u)$ is also a root of $F(u)$. Hence

$$F(u) = f(u) f^{*}(u)$$

where $f^{*}(u)$ is a polynomial of degree $[a] - [b] - 2$, i.e.,

$$f^{*}(u) = k_0 + k_1 u + \cdots + k_{[a]-[b]-2}\, u^{[a]-[b]-2}$$

Putting the coefficient of any power of $u$ in $F(u)$ equal to the coefficient of the same power of $u$ in $f(u) f^{*}(u)$, we obtain a system of $g + [a] - [b] - 1$ linear equations in the $g + [a] - [b] - 1$ unknowns $\xi_1, \cdots, \xi_g, k_0, k_1, \cdots, k_{[a]-[b]-2}$, from which these unknowns can be determined. Thus, the probabilities $\xi_1, \cdots, \xi_g$ can be determined without solving the polynomial equation (A:128). This advantage is, however, bought for the price of an increased number of linear equations to be solved. If the roots of the polynomial equation (A:128) are computed, only $g$ linear equations have to be solved for determining $\xi_1, \cdots, \xi_g$. If Girshick's method is used, no polynomial equation is to be solved, but the number of linear equations is increased to $g + [a] - [b] - 1$.
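The coefficient-matching idea can be sketched for a small case. The code below is an illustration only, not Girshick's own formulation: it assumes $d = 1$, a variable $z$ taking the values $+1$ and $-2$ (so $g_1 = 2$, $g_2 = 1$), integer boundaries, and multipliers $u^{g_1}$ and $u^{g_1 - [b] - 1}$ chosen so that both expressions become polynomials. All function and variable names are hypothetical.

```python
import numpy as np


def terminal_probs_by_matching(log_a, log_b, p):
    """Solve F(u) = f(u) f*(u) by equating coefficients, for steps +1 (prob. p)
    and -2 (prob. 1 - p), with integer boundaries log_b < 0 < log_a.
    Returns (xi_1, xi_2, xi_3) for Z_n = log_b - 1, log_b, log_a."""
    g1 = 2
    # f(u) = u^2 * (p*u + (1-p)*u^-2 - 1); coefficients indexed by power of u.
    f = [1.0 - p, 0.0, -1.0, p]
    n_k = log_a - log_b - 1                 # number of coefficients of f*(u)
    size = 3 + n_k                          # unknowns: xi_1..xi_3, k_0..k_{n_k-1}
    # Powers of u carried by xi_1, xi_2, xi_3 after multiplying by u^(g1-[b]-1).
    xi_powers = [0, 1, log_a - log_b + 1]
    m = np.zeros((size, size))
    rhs = np.zeros(size)
    for j, power in enumerate(xi_powers):
        m[power, j] = 1.0                   # coefficient of xi_j in F(u)
    for l in range(n_k):
        for r, f_r in enumerate(f):
            m[l + r, 3 + l] -= f_r          # minus coefficient of k_l in f*f*
    rhs[g1 - log_b - 1] = 1.0               # the "-1" term of F, moved to the right
    return np.linalg.solve(m, rhs)[:3]


xi_matching = terminal_probs_by_matching(3, -4, 3.0 / 7.0)
```

With $p = 3/7$ and these boundaries the system has $g + [a] - [b] - 1 = 9$ equations, compared with $g = 3$ equations when the roots of (A:128) are computed first.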

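If $g_2 = 1$, the OC function $L(\theta)$ is a simple expression of the roots $u_1, \cdots, u_g$. In fact, $L(\theta) = P(Z_n \le \log B) = 1 - P(Z_n \ge \log A) = 1 - \xi_g$. We have

$$\Delta = \begin{vmatrix} u_1^{[b]-g_1+1} & \cdots & u_1^{[b]} & u_1^{[a]} \\ \vdots & & \vdots & \vdots \\ u_g^{[b]-g_1+1} & \cdots & u_g^{[b]} & u_g^{[a]} \end{vmatrix} \quad\text{and}\quad \Delta_g = \begin{vmatrix} u_1^{[b]-g_1+1} & \cdots & u_1^{[b]} & 1 \\ \vdots & & \vdots & \vdots \\ u_g^{[b]-g_1+1} & \cdots & u_g^{[b]} & 1 \end{vmatrix}$$

The value of the ratio $\Delta_g/\Delta$ is not changed if we multiply the $i$th row of $\Delta$, as well as that of $\Delta_g$, by $u_i^{g_1-[b]-1}$. Thus

$$\frac{\Delta_g}{\Delta} = \frac{\begin{vmatrix} 1 & u_1 & \cdots & u_1^{g_1-1} & u_1^{g_1-[b]-1} \\ \vdots & \vdots & & \vdots & \vdots \\ 1 & u_g & \cdots & u_g^{g_1-1} & u_g^{g_1-[b]-1} \end{vmatrix}}{\begin{vmatrix} 1 & u_1 & \cdots & u_1^{g_1-1} & u_1^{g_1-1+[a]-[b]} \\ \vdots & \vdots & & \vdots & \vdots \\ 1 & u_g & \cdots & u_g^{g_1-1} & u_g^{g_1-1+[a]-[b]} \end{vmatrix}}$$

The cofactor of each element in the last column is a Vandermonde determinant. Expanding the determinants in the numerator and denominator according to their last columns and dividing numerator and denominator by the Vandermonde determinant

$$\begin{vmatrix} 1 & u_1 & \cdots & u_1^{g-1} \\ \vdots & \vdots & & \vdots \\ 1 & u_g & \cdots & u_g^{g-1} \end{vmatrix}$$

we obtain

$$\xi_g = \frac{\Delta_g}{\Delta} = \frac{\displaystyle\sum_{i=1}^{g} \frac{u_i^{g_1-[b]-1}}{\prod_{j \ne i}(u_i - u_j)}}{\displaystyle\sum_{i=1}^{g} \frac{u_i^{g_1-1+[a]-[b]}}{\prod_{j \ne i}(u_i - u_j)}} \quad (g_1 = g - 1)$$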

We shall illustrate the derivation of the exact OC and ASN functions by a simple example. Let $x$ be a random variable which can take only the values 0 and 1. Denote by $H_i$ ($i = 0, 1$) the hypothesis that the probability that $x = 1$ is equal to $p_i$ ($i = 0, 1$). Let

$$p_0 = \frac{1 - e^{-2}}{e - e^{-2}} \quad\text{and}\quad p_1 = \frac{e - e^{-1}}{e - e^{-2}}$$

Consider the sequential test for testing $H_0$ against $H_1$. We shall compute the probability that the process will terminate with the acceptance of $H_0$, and the expected number of trials required by the test, when the true probability that $x = 1$ is equal to $p = \tfrac{3}{7}$. In what follows in this section, all probability statements and expected values refer to the case when $p = \tfrac{3}{7}$.

First we compute $\phi(t) = E(e^{zt})$. Since $z$ can take only the values

$$\log \frac{p_1}{p_0} = \log e = 1 \quad\text{and}\quad \log \frac{1 - p_1}{1 - p_0} = \log e^{-2} = -2$$

with probabilities $\tfrac{3}{7}$ and $\tfrac{4}{7}$ respectively, we have

$$\phi(t) = \tfrac{3}{7} e^{t} + \tfrac{4}{7} e^{-2t}$$

Letting $e^{t} = u$ and solving the equation

$$\frac{3}{7} u + \frac{4}{7} \frac{1}{u^2} = 1$$





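we obtain the roots $u_1 = 1$, $u_2 = 2$, and $u_3 = -\tfrac{2}{3}$. The integers $c_1, c_2, c_3$ are given by

$$c_1 = \log B - 1, \quad c_2 = \log B, \quad c_3 = \log A$$

Hence

$$\Delta = \begin{vmatrix} 1 & 1 & 1 \\ 2^{\log B - 1} & 2^{\log B} & 2^{\log A} \\ (-\tfrac{2}{3})^{\log B - 1} & (-\tfrac{2}{3})^{\log B} & (-\tfrac{2}{3})^{\log A} \end{vmatrix} \qquad \Delta_1 = \begin{vmatrix} 1 & 1 & 1 \\ 1 & 2^{\log B} & 2^{\log A} \\ 1 & (-\tfrac{2}{3})^{\log B} & (-\tfrac{2}{3})^{\log A} \end{vmatrix}$$

$$\Delta_2 = \begin{vmatrix} 1 & 1 & 1 \\ 2^{\log B - 1} & 1 & 2^{\log A} \\ (-\tfrac{2}{3})^{\log B - 1} & 1 & (-\tfrac{2}{3})^{\log A} \end{vmatrix} \qquad \Delta_3 = \begin{vmatrix} 1 & 1 & 1 \\ 2^{\log B - 1} & 2^{\log B} & 1 \\ (-\tfrac{2}{3})^{\log B - 1} & (-\tfrac{2}{3})^{\log B} & 1 \end{vmatrix}$$

Then the probability that $H_0$ will be accepted is given by

$$L = \frac{\Delta_1 + \Delta_2}{\Delta}$$

The expected value of $n$ is given by

$$E(n) = \frac{1}{E(z)} \cdot \frac{c_1 \Delta_1 + c_2 \Delta_2 + c_3 \Delta_3}{\Delta} = -\frac{7}{5} \cdot \frac{(\log B - 1)\Delta_1 + (\log B)\Delta_2 + (\log A)\Delta_3}{\Delta} = \frac{7}{5} \cdot \frac{(-\log B + 1)\Delta_1 + (-\log B)\Delta_2 - (\log A)\Delta_3}{\Delta}$$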
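The example can be checked numerically. The following sketch solves the linear equations (A:131) for the roots $1$, $2$, $-\tfrac{2}{3}$ and compares the resulting OC and ASN values with a direct absorbing-chain computation for the same walk. The boundary values $\log A = 3$ and $\log B = -4$ and all function names are assumptions made for illustration; the text leaves the boundaries general.

```python
import numpy as np

P_STEP_UP = 3.0 / 7.0                       # probability that z = +1 (else z = -2)


def exact_oc_asn(log_a, log_b):
    """OC and ASN from the terminal probabilities xi_j solving (A:131)."""
    roots = [1.0, 2.0, -2.0 / 3.0]
    c = [log_b - 1, log_b, log_a]           # possible terminal values of Z_n
    matrix = np.array([[u ** cj for cj in c] for u in roots])
    xi = np.linalg.solve(matrix, np.ones(3))
    mean_z = P_STEP_UP * 1.0 + (1.0 - P_STEP_UP) * (-2.0)
    oc = xi[0] + xi[1]                      # P(Z_n <= log B)
    asn = float(np.dot(c, xi)) / mean_z     # E(n) = E(Z_n) / E(z)
    return xi, oc, asn


def chain_oc_asn(log_a, log_b):
    """Same quantities from the absorbing Markov chain of the cumulative sum."""
    states = list(range(log_b + 1, log_a))
    index = {s: k for k, s in enumerate(states)}
    size = len(states)
    q = np.zeros((size, size))
    accept = np.zeros(size)                 # one-step probability of Z <= log B
    for s in states:
        for step, prob in ((1, P_STEP_UP), (-2, 1.0 - P_STEP_UP)):
            target = s + step
            if target in index:
                q[index[s], index[target]] += prob
            elif target <= log_b:
                accept[index[s]] += prob
    system = np.eye(size) - q
    oc = np.linalg.solve(system, accept)[index[0]]
    asn = np.linalg.solve(system, np.ones(size))[index[0]]
    return oc, asn


xi, oc, asn = exact_oc_asn(3, -4)
oc_chain, asn_chain = chain_oc_asn(3, -4)
```

Solving the linear system directly is equivalent to forming the ratios $\Delta_j/\Delta$ by Cramer's rule, and avoids evaluating the four determinants separately.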


A.5 THE CHARACTERISTIC FUNCTION AND HIGHER MOMENTS OF n

A.5.1 Derivation of Approximate Formulas Neglecting the Excess of the Cumulative Sum over the Boundaries

Let $\bar Z_n$ be a random variable defined as follows: $\bar Z_n = \log A$ if $Z_n = z_1 + \cdots + z_n \ge \log A$, and $\bar Z_n = \log B$ if $Z_n \le \log B$. Denote the difference $Z_n - \bar Z_n$ by $\epsilon$. Then $\epsilon$ is a random variable.




In what follows in this section we shall neglect $\epsilon$, i.e., we shall substitute 0 for $\epsilon$. No error is committed by doing so in the special case when $z$ can take only two values, $d$ and $-d$, and the ratios $(\log A)/d$ and $(\log B)/d$ are integers, since in this case $\epsilon$ is exactly 0. Apart from this special case $\epsilon$ will not be identical with the constant 0. However, the smaller $|E(z)|$ and $E(z^2)$, the smaller the error we commit by neglecting $\epsilon$. In fact, for arbitrarily small positive numbers $\delta_1$ and $\delta_2$ the inequality $P(|\epsilon| \le \delta_1) \ge 1 - \delta_2$ will hold if $|E(z)|$ and $E(z^2)$ are sufficiently small. Thus, in the limiting case when $E(z)$ and $E(z^2)$ approach 0, the random variable $\epsilon$ reduces to the constant 0.

As in the preceding section, all probability statements and expected values will refer to the case in which $\theta$ is the true parameter point, without putting this in evidence in the formulas by using $\theta$ as a subscript to the operators $P$ and $E$. Let $\phi(t)$ be the moment generating function of $z$, i.e.,

$$\phi(t) = E(e^{zt})$$

To derive an approximation to the characteristic function of $n$, we shall consider the equation

(A:136) $$-\log \phi(t) = \tau$$

where $\tau$ is a purely imaginary quantity. It will be assumed that $z$ satisfies the conditions of lemma A.1. Then, according to lemma A.1, the equation $-\log \phi(t) = 0$ has exactly two real roots in $t$; they are $t = 0$ and $t = h$ ($h \ne 0$). Furthermore $\phi'(0)$ and $\phi'(h)$ both are unequal to 0. Hence, if $\phi(t)$ is not singular at $t = 0$ and $t = h$, equation (A:136) has two roots, $t_1(\tau)$ and $t_2(\tau)$, for sufficiently small values of $|\tau|$ such that $\lim_{\tau = 0} t_1(\tau) = 0$ and $\lim_{\tau = 0} t_2(\tau) = h$. Identity (A:16) can be written as

(A:137) $$L\, E^{*}\!\left\{e^{Z_n t}[\phi(t)]^{-n}\right\} + (1 - L)\, E^{**}\!\left\{e^{Z_n t}[\phi(t)]^{-n}\right\} = 1$$

where $L$ denotes the probability that the test procedure leads to the acceptance of $H_0$, $E^{*}$ stands for conditional expected value under the restriction that the process leads to the acceptance of $H_0$, and $E^{**}$ stands for conditional expected value under the restriction that the process leads to the rejection of $H_0$. Neglecting the excess of $Z_n$ over the boundaries, we have $Z_n = \log B$ when the process leads to the acceptance of $H_0$, and $Z_n = \log A$ when the process leads to the rejection of $H_0$. Hence (A:137) can be written as

(A:138) $$L B^{t} E^{*}[\phi(t)]^{-n} + (1 - L) A^{t} E^{**}[\phi(t)]^{-n} = 1$$





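This identity is valid for all values of $t$ for which $|\phi(t)| \ge 1$.¹ Letting $t = t_1(\tau)$ and $t = t_2(\tau)$, we obtain, from (A:138),

(A:139) $$L B^{t_1(\tau)} E^{*}(e^{n\tau}) + (1 - L) A^{t_1(\tau)} E^{**}(e^{n\tau}) = 1$$

and

(A:140) $$L B^{t_2(\tau)} E^{*}(e^{n\tau}) + (1 - L) A^{t_2(\tau)} E^{**}(e^{n\tau}) = 1$$

Solving these equations in $E^{*}(e^{n\tau})$ and $E^{**}(e^{n\tau})$, we obtain

(A:141) $$\psi^{*}(\tau) = E^{*}(e^{n\tau}) = \frac{A^{t_2(\tau)} - A^{t_1(\tau)}}{L\left(B^{t_1(\tau)} A^{t_2(\tau)} - A^{t_1(\tau)} B^{t_2(\tau)}\right)}$$

and

(A:142) $$\psi^{**}(\tau) = E^{**}(e^{n\tau}) = \frac{B^{t_1(\tau)} - B^{t_2(\tau)}}{(1 - L)\left(B^{t_1(\tau)} A^{t_2(\tau)} - A^{t_1(\tau)} B^{t_2(\tau)}\right)}$$

for all imaginary values $\tau$.

The unconditional expected value $E(e^{n\tau})$ is clearly equal to

(A:143) $$E(e^{n\tau}) = L\, E^{*}(e^{n\tau}) + (1 - L)\, E^{**}(e^{n\tau})$$

Hence, the characteristic function of $n$ is given by

(A:144) $$\psi(\tau) = E(e^{n\tau}) = \frac{A^{t_2(\tau)} + B^{t_1(\tau)} - A^{t_1(\tau)} - B^{t_2(\tau)}}{B^{t_1(\tau)} A^{t_2(\tau)} - A^{t_1(\tau)} B^{t_2(\tau)}}$$

(for all imaginary $\tau$).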

By definition, the expected value $E(e^{n\tau})$ is the characteristic function of $n$, and (A:144) gives the desired approximation formula when the excess of $Z_n$ over the boundaries can be neglected. Our derivations yield also approximation formulas for $\psi^{*}(\tau) = E^{*}(e^{n\tau})$ and $\psi^{**}(\tau) = E^{**}(e^{n\tau})$. The function $\psi^{*}(\tau)$ can be interpreted as the characteristic function of the conditional distribution of $n$ when the process leads to the acceptance of $H_0$, and $\psi^{**}(\tau)$ can be interpreted as the characteristic function of the distribution of $n$ in the subpopulation of samples leading to the rejection of $H_0$.

As an illustration we shall determine $\psi^{*}(\tau)$, $\psi^{**}(\tau)$, and $\psi(\tau)$ when $z$ has a normal distribution. Denote by $\mu$ the mean of $z$ and by $\sigma$ the standard deviation of $z$. Then equation (A:136) can be written as

$$-\log \phi(t) = -\mu t - \frac{\sigma^2}{2} t^2 = \tau$$


¹ This follows from the considerations in Section A.2.2, since $D'$ is the whole complex plane in our case.




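Hence

(A:145) $$\frac{\sigma^2}{2} t^2 + \mu t + \tau = 0$$

Thus

(A:146) $$t_1(\tau) = \frac{-\mu + \sqrt{\mu^2 - 2\sigma^2\tau}}{\sigma^2}$$

and

(A:147) $$t_2(\tau) = \frac{-\mu - \sqrt{\mu^2 - 2\sigma^2\tau}}{\sigma^2}$$

where the sign of $\sqrt{\mu^2 - 2\sigma^2\tau}$ is determined so that the real part of $\sqrt{\mu^2 - 2\sigma^2\tau}$ is positive. Substituting these values for $t_1(\tau)$ and $t_2(\tau)$ in (A:141), (A:142), and (A:144), we obtain $\psi^{*}(\tau)$, $\psi^{**}(\tau)$, and $\psi(\tau)$ in the case when $z$ is normally distributed. According to formula (3:43), an approximation to $L$ is given by

(A:148) $$L \sim \frac{A^{h} - 1}{A^{h} - B^{h}}$$

When $z$ is normally distributed we have

(A:149) $$h = -\frac{2\mu}{\sigma^2}$$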
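For normally distributed $z$ the approximate characteristic function can be evaluated directly from the two roots of the quadratic equation. The sketch below is illustrative: it assumes the form of $\psi(\tau)$ obtained by solving (A:139) and (A:140), uses the principal branch of the complex square root (non-negative real part), and checks the first moment by a central difference against the familiar approximation $E(n) \sim [L \log B + (1 - L)\log A]/E(z)$. The parameter values and function names are assumptions.

```python
import cmath
import math


def cf_roots(tau, mu, sigma):
    """Roots t1, t2 of (sigma^2/2) t^2 + mu t + tau = 0."""
    root = cmath.sqrt(mu * mu - 2.0 * sigma * sigma * tau)   # principal branch
    return (-mu + root) / sigma**2, (-mu - root) / sigma**2


def psi(tau, mu, sigma, log_a, log_b):
    """Approximate value of E(exp(n*tau)), neglecting the excess over the boundaries."""
    t1, t2 = cf_roots(tau, mu, sigma)
    a_t1, a_t2 = cmath.exp(t1 * log_a), cmath.exp(t2 * log_a)
    b_t1, b_t2 = cmath.exp(t1 * log_b), cmath.exp(t2 * log_b)
    return (a_t2 + b_t1 - a_t1 - b_t2) / (b_t1 * a_t2 - a_t1 * b_t2)


def approx_mean_n(mu, sigma, log_a, log_b):
    """E(n) ~ [L log B + (1 - L) log A] / mu with L from (A:148) and h from (A:149)."""
    h = -2.0 * mu / sigma**2
    a_h, b_h = math.exp(h * log_a), math.exp(h * log_b)
    oc = (a_h - 1.0) / (a_h - b_h)
    return (oc * log_b + (1.0 - oc) * log_a) / mu


mu, sigma, log_a, log_b = 0.5, 1.0, 2.0, -2.0
eps = 1e-4
# First moment as the derivative of psi at tau = 0 (central difference).
mean_from_cf = ((psi(eps, mu, sigma, log_a, log_b)
                 - psi(-eps, mu, sigma, log_a, log_b)) / (2.0 * eps)).real
mean_direct = approx_mean_n(mu, sigma, log_a, log_b)
```

Small real values of $\tau$ are used here only to differentiate numerically; the derivation in the text takes $\tau$ purely imaginary.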



It is of interest to consider the following two limiting cases: (1) $B = 0$ and $A$ is a finite positive value; (2) $B$ is a finite positive value and $A = +\infty$. It can be shown that $E(n)$ will be finite in case (1) only if $E(z) > 0$. Similarly, $E(n)$ will be finite in case (2) only if $E(z) < 0$. Thus, in case (1) we shall assume that $E(z) > 0$, and in case (2) we shall assume that $E(z) < 0$. To obtain the characteristic function $\psi(\tau)$ of $n$ in case (1), we have to determine the limiting value of the right-hand member of (A:144) when $B \to 0$. For this purpose we shall first derive the limiting value of $\left|B^{t_2(\tau) - t_1(\tau)}\right|$ when $B \to 0$. Since in case (1) $E(z)$ is assumed to be $> 0$, the quantity $h = \lim_{\tau = 0} t_2(\tau)$ must be negative, as has been shown in Section A.2.1. Hence, for small $\tau$ the real part of $t_2(\tau)$ is negative. On the other hand, the real part of $t_1(\tau)$ approaches 0 as $\tau \to 0$. Thus, for small $\tau$ the real part of $t_2(\tau) - t_1(\tau)$ is negative, and, therefore,

(A:150) $$\lim_{B = 0} \left|B^{t_2(\tau) - t_1(\tau)}\right| = +\infty$$

From (A:150) and from the relation $\lim_{B = 0} \left|B^{t_2(\tau)}\right| = \infty$, it follows that with $B \to 0$ the right-hand member of (A:144) converges to

(A:151) $$A^{-t_1(\tau)}$$





Thus, if $E(z) > 0$ the characteristic function of $n$ in case (1) is given by (A:151). When $z$ is normally distributed, $t_1(\tau)$ is given by (A:146). Hence, for normally distributed $z$ with $\mu > 0$ the characteristic function of $n$ in case (1) is given by

(A:152) $$\psi(\tau) = A^{\frac{\mu - \sqrt{\mu^2 - 2\sigma^2\tau}}{\sigma^2}}$$

In case (2) we have assumed that $E(z) < 0$. Hence $t_2(\tau)$ and $t_2(\tau) - t_1(\tau)$ will have a positive real part for small $\tau$. Thus,

(A:153) $$\lim_{A = \infty} \left|A^{t_2(\tau)}\right| = \lim_{A = \infty} \left|A^{t_2(\tau) - t_1(\tau)}\right| = +\infty$$

From (A:153) it follows that the limiting value of the right-hand member of (A:144) when $A \to \infty$ is given by

(A:154) $$B^{-t_1(\tau)}$$

Thus, if $E(z) < 0$, the characteristic function of $n$ in case (2) is given by (A:154).

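The moments of $n$ can be obtained by differentiating the characteristic function of $n$. For any positive integer $r$ the $r$th moment of $n$ is given by

(A:155) $$E(n^{r}) = \left.\frac{d^{r}\psi(\tau)}{d\tau^{r}}\right|_{\tau = 0}$$

We can also obtain the conditional moments of $n$ in the subpopulation of samples for which $Z_n \le \log B$, as well as in the subpopulation of samples for which $Z_n \ge \log A$. Let $E^{*}(n^{r})$ denote the conditional expected value of $n^{r}$ in the subpopulation $Z_n \le \log B$, and let $E^{**}(n^{r})$ denote the expected value of $n^{r}$ in the subpopulation $Z_n \ge \log A$. Then we have

$$E^{*}(n^{r}) = \left.\frac{d^{r}\psi^{*}(\tau)}{d\tau^{r}}\right|_{\tau = 0} \quad\text{and}\quad E^{**}(n^{r}) = \left.\frac{d^{r}\psi^{**}(\tau)}{d\tau^{r}}\right|_{\tau = 0}$$

where $\psi^{*}(\tau)$ and $\psi^{**}(\tau)$ are the conditional characteristic functions given in (A:141) and (A:142).

It may be of interest to note that $\left.\dfrac{d^{r}\psi^{*}(\tau)}{d\tau^{r}}\right|_{\tau=0}$ and $\left.\dfrac{d^{r}\psi^{**}(\tau)}{d\tau^{r}}\right|_{\tau=0}$ and, therefore, also $E(n^{r}) = \left.\dfrac{d^{r}\psi(\tau)}{d\tau^{r}}\right|_{\tau=0}$ can be obtained from identity (A:138) directly by successive differentiation. In fact, (A:138) can be written as

(A:156) $$L B^{t} \psi^{*}[-\log \phi(t)] + (1 - L) A^{t} \psi^{**}[-\log \phi(t)] = 1$$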




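Taking the first $r$ derivatives of (A:156) with respect to $t$ at $t = 0$ and $t = h$, we obtain a system of $2r$ linear equations in the $2r$ unknowns $\left.\dfrac{d^{j}\psi^{*}(\tau)}{d\tau^{j}}\right|_{\tau=0}$ and $\left.\dfrac{d^{j}\psi^{**}(\tau)}{d\tau^{j}}\right|_{\tau=0}$ ($j = 1, \cdots, r$) from which these unknowns can be determined. For example, $\left.\dfrac{d\psi^{*}(\tau)}{d\tau}\right|_{\tau=0}$ and $\left.\dfrac{d\psi^{**}(\tau)}{d\tau}\right|_{\tau=0}$ can be determined as follows. Taking the first derivative of (A:156) with respect to $t$ we obtain

(A:157) $$L(\log B)B^{t}\psi^{*}(\tau) - L B^{t}\frac{\phi'(t)}{\phi(t)}\frac{d\psi^{*}(\tau)}{d\tau} + (1 - L)(\log A)A^{t}\psi^{**}(\tau) - (1 - L)A^{t}\frac{\phi'(t)}{\phi(t)}\frac{d\psi^{**}(\tau)}{d\tau} = 0 \quad [\tau = -\log \phi(t)]$$

Letting $t = 0$ and $t = h$ we obtain the equations

(A:158) $$L \log B - L\,\frac{\phi'(0)}{\phi(0)}\left.\frac{d\psi^{*}(\tau)}{d\tau}\right|_{\tau=0} + (1 - L)\log A - (1 - L)\frac{\phi'(0)}{\phi(0)}\left.\frac{d\psi^{**}(\tau)}{d\tau}\right|_{\tau=0} = 0$$

and

(A:159) $$L(\log B)B^{h} - L B^{h}\frac{\phi'(h)}{\phi(h)}\left.\frac{d\psi^{*}(\tau)}{d\tau}\right|_{\tau=0} + (1 - L)(\log A)A^{h} - (1 - L)A^{h}\frac{\phi'(h)}{\phi(h)}\left.\frac{d\psi^{**}(\tau)}{d\tau}\right|_{\tau=0} = 0$$

from which $\left.\dfrac{d\psi^{*}(\tau)}{d\tau}\right|_{\tau=0}$ and $\left.\dfrac{d\psi^{**}(\tau)}{d\tau}\right|_{\tau=0}$ can be determined.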


A.5.2 Derivation of Exact Formulas When z Can Take Only a Finite Number of Integral Multiples of a Constant

We shall use here the notation defined in Section A.4 without any further explanation. Let $\psi_i(\tau)$ denote the characteristic function of the conditional distribution of $n$ in the subpopulation of samples for which $Z_n = c_i d$ ($i = 1, \cdots, g$). The equation in $t$

(A:160) $$\phi(t) = e^{-\tau}$$

has $g$ roots $t_1(\tau), \cdots, t_g(\tau)$ such that

(A:161) $$\lim_{\tau = 0} e^{d\, t_i(\tau)} = u_i \quad (i = 1, \cdots, g)$$



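The fundamental identity (A:16) can be written as

(A:162) $$\sum_{i=1}^{g} \xi_i e^{c_i d t}\, \psi_i[-\log \phi(t)] = 1$$

Substituting $t_i(\tau)$ for $t$ in (A:162), we obtain

(A:163) $$\sum_{j=1}^{g} \xi_j e^{c_j d\, t_i(\tau)}\, \psi_j(\tau) = 1 \quad (i = 1, \cdots, g)$$

These equations are linear in the unknowns $\psi_1(\tau), \cdots, \psi_g(\tau)$, and the determinant of these equations is given by

(A:164) $$\delta(\tau) = \begin{vmatrix} \xi_1 e^{c_1 d\, t_1(\tau)} & \cdots & \xi_g e^{c_g d\, t_1(\tau)} \\ \vdots & & \vdots \\ \xi_1 e^{c_1 d\, t_g(\tau)} & \cdots & \xi_g e^{c_g d\, t_g(\tau)} \end{vmatrix}$$

Obviously, $\delta(0) = \xi_1 \xi_2 \cdots \xi_g \Delta$. Hence, if $\xi_i \ne 0$ ($i = 1, \cdots, g$) and $\Delta \ne 0$, then $\delta(0) \ne 0$, and consequently $\delta(\tau) \ne 0$ for any $\tau$ with sufficiently small absolute value. Thus, $\psi_1(\tau), \cdots, \psi_g(\tau)$ can be obtained by solving the linear equations (A:163).² The characteristic function $\psi(\tau)$ of the unconditional distribution of $n$ is given by

(A:165) $$\psi(\tau) = \sum_{i=1}^{g} \xi_i \psi_i(\tau)$$

For any positive integer $r$, the exact $r$th moment of $n$, i.e., $E(n^{r})$, is given by the $r$th derivative of $\psi(\tau)$ with respect to $\tau$ at $\tau = 0$.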

A.6 APPROXIMATE DISTRIBUTION OF n WHEN z IS NORMALLY DISTRIBUTED

A.6.1 The Case When B = 0 and A Is Finite

In this case we have assumed that $E(z) = \mu > 0$. Then the approximate characteristic function of $n$, if the excess of $Z_n$ over the boundaries is neglected, is given by (A:152). Let

(A:166) $$m = \frac{\mu^2}{2\sigma^2}\, n$$

² This method of determining $\psi_1(\tau), \cdots, \psi_g(\tau)$ requires the computation of the roots of equation (A:160). This can be avoided, as Girshick has shown in his paper mentioned in Section A.4, if a device is used similar to that applied by him for determining $\xi_1, \cdots, \xi_g$ (see Section A.4).




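Then the characteristic function of $m$ is given by

(A:167) $$\psi(t) = e^{c - c\sqrt{1 - t}}$$

where

(A:168) $$c = \frac{a\mu}{\sigma^2} > 0$$

and

(A:169) $$a = \log A$$

The sign of the square root in (A:167) is determined so that the real part of $\sqrt{1 - t}$ is positive. The distribution of $m$ is given by

(A:170) $$F(m) = \frac{1}{2\pi i} \int_{-i\infty}^{i\infty} e^{c - c\sqrt{1 - t} - mt}\, dt$$

Let

(A:171) $$G(c, m) = \frac{1}{2\pi i} \int_{-i\infty}^{i\infty} e^{-c\sqrt{1 - t} - mt}\, dt$$

and

(A:172) $$H(c, m) = \frac{1}{2\pi i} \int_{-i\infty}^{i\infty} \frac{1}{\sqrt{1 - t}}\, e^{-c\sqrt{1 - t} - mt}\, dt$$

Since

(A:173) $$\frac{1}{2\pi i} \frac{d}{dt}\, e^{-c\sqrt{1 - t} - mt} = \frac{1}{2\pi i} \left(\frac{c}{2\sqrt{1 - t}} - m\right) e^{-c\sqrt{1 - t} - mt}$$

we have

(A:174) $$\frac{c}{2} H(c, m) - m\, G(c, m) = \frac{1}{2\pi i} \left[e^{-c\sqrt{1 - t} - mt}\right]_{t = -i\infty}^{t = i\infty} = 0$$

From (A:171) and (A:172) we obtain

(A:175) $$\frac{\partial H(c, m)}{\partial c} + G(c, m) = 0$$

From (A:174) and (A:175) it follows that

(A:176) $$\frac{c}{2} H(c, m) + m\, \frac{\partial H(c, m)}{\partial c} = 0$$

Hence

(A:177) $$\log H(c, m) = -\frac{c^2}{4m} + \log \lambda(m)$$

where $\lambda(m)$ is some function of $m$ only. Thus

(A:178) $$H(c, m) = \lambda(m)\, e^{-\frac{c^2}{4m}}$$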





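Now we shall determine $\lambda(m)$. We have

(A:179) $$\lambda(m) = H(0, m) = \frac{1}{2\pi i} \int_{-i\infty}^{i\infty} \frac{1}{\sqrt{1 - t}}\, e^{-mt}\, dt$$

Since $(1 - t)^{-1/2}$ is the characteristic function of $\tfrac{1}{2}\chi^2$, where $\chi^2$ has the $\chi^2$-distribution with one degree of freedom, the right-hand side of (A:179) is equal to

$$\frac{1}{\Gamma(\tfrac{1}{2})\sqrt{m}}\, e^{-m}$$

Hence

(A:180) $$\lambda(m) = \frac{1}{\Gamma(\tfrac{1}{2})\sqrt{m}}\, e^{-m}$$

From (A:178) and (A:180) we obtain

(A:181) $$H(c, m) = \frac{1}{\Gamma(\tfrac{1}{2})\sqrt{m}}\, e^{-\frac{c^2}{4m} - m}$$

From (A:174) and (A:181) we obtain

(A:182) $$G(c, m) = \frac{c}{2\Gamma(\tfrac{1}{2})\, m^{3/2}}\, e^{-\frac{c^2}{4m} - m}$$

Hence the distribution of $m$ is given by

(A:183) $$F(m)\, dm = \frac{c}{2\Gamma(\tfrac{1}{2})\, m^{3/2}}\, e^{-\frac{c^2}{4m} - m + c}\, dm \quad (0 \le m < \infty)$$

Let $m = (c/2)m^{*}$. Then the distribution of $m^{*}$ is given by

(A:184) $$D(m^{*})\, dm^{*} = \frac{\sqrt{c}}{\sqrt{2\pi}\, (m^{*})^{3/2}}\, e^{-\frac{c}{2}\left(\frac{1}{m^{*}} + m^{*} - 2\right)}\, dm^{*}$$

The function $(1/m^{*}) + m^{*} - 2$ is non-negative and is equal to 0 only when $m^{*} = 1$. If $c$ is large, then $D(m^{*})$ is exceedingly small for values of $m^{*}$ not close to 1. Expanding $(1/m^{*}) + m^{*} - 2$ in a Taylor series around $m^{*} = 1$, we obtain

(A:185) $$\frac{1}{m^{*}} + m^{*} - 2 = (m^{*} - 1)^2 + \text{higher order terms}$$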




Hence for large $c$

(A:186) $$D(m^{*})\, dm^{*} \sim \frac{\sqrt{c}}{\sqrt{2\pi}}\, e^{-\frac{c}{2}(m^{*} - 1)^2}\, dm^{*}$$

i.e., if $c$ is large $m^{*}$ is nearly normally distributed with mean equal to 1 and standard deviation $1/\sqrt{c}$.
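The density of $m$ obtained above can be checked numerically. The sketch below evaluates it on a grid and verifies that it integrates to 1 with mean $c/2$ (equivalently, $m^{*}$ has mean 1). It writes $\Gamma(\tfrac{1}{2})$ as $\sqrt{\pi}$; the value $c = 4$, the grid, and the helper names are assumptions chosen for illustration.

```python
import numpy as np


def density_m(m, c):
    """Density of m: c / (2 sqrt(pi) m^(3/2)) * exp(-c^2/(4m) - m + c)."""
    m = np.asarray(m, dtype=float)
    return c / (2.0 * np.sqrt(np.pi) * m**1.5) * np.exp(-c * c / (4.0 * m) - m + c)


def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid version-specific helpers."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


c = 4.0
grid = np.linspace(1e-6, 60.0, 600001)      # the density is negligible outside
values = density_m(grid, c)
total = trapezoid(values, grid)              # should be close to 1
mean_m = trapezoid(grid * values, grid)      # should be close to c / 2
```

For large $c$ the same grid values, rescaled to $m^{*} = 2m/c$, can be compared with the normal density of mean 1 and standard deviation $1/\sqrt{c}$.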


A.6.2 The Case When B > 0 and A = ∞

In this case we have assumed that $E(z) = \mu < 0$. It can easily be shown that the distribution of $m = (\mu^2/2\sigma^2)n$ is now given by the expression we obtain from (A:183) if we substitute $(\mu/\sigma^2)\log B$ for $c$.


A.6.3 The Case When B > 0 and A Is Finite

In this case the approximate characteristic function of $n$, if the excess of $Z_n$ over the boundaries is neglected, is given by (A:144) where $t_1(\tau)$ and $t_2(\tau)$ are equal to the right-hand members of (A:146) and (A:147), respectively. Let

$$m = \frac{\mu^2}{2\sigma^2}\, n \quad\text{and}\quad d = -\frac{\mu}{\sigma^2}$$

Then the characteristic function of $m$ is given by

(A:187) $$\psi(t) = \frac{A^{t_2} + B^{t_1} - A^{t_1} - B^{t_2}}{A^{t_2}B^{t_1} - A^{t_1}B^{t_2}}$$

where

(A:188) $$t_1 = d\left(1 - \sqrt{1 - t}\right), \quad t_2 = d\left(1 + \sqrt{1 - t}\right)$$

and $t$ is an imaginary variable. Letting $A^{d} = \bar A$, $B^{d} = \bar B$, $da = \bar a$, and $db = \bar b$, the characteristic function of $m$ can be written as


(A:189) 

” AB(^l — 

It will be sufficient to consider only the case when $\mu > 0$, since the case when $\mu < 0$ can be treated in a similar way. Then $\bar a < 0$ and $\bar b > 0$. Since the real part of $+\sqrt{1 - t}$ is greater than or equal to 1, we have


(A: 190) 



< 1 


for any imaginary value of $t$. Let

(A:191) T = 

Then 


(A: 192) 





From (A:189) and (A:192) it follows that 
form of an infinite series: 


(A:193) 


oo 


^'(O = 

i= 1 




can be written in the 


where $\lambda_i$ and $r_i$ are constants and $\lambda_i > 0$. Each term of this series is a characteristic function of the form given in (A:167) except for a proportionality factor. Let $F_i(m)$ be the distribution of $m$ corresponding to the characteristic function $e^{\lambda_i - \lambda_i\sqrt{1 - t}}$. Then $F_i(m)$ can be obtained from (A:183) by substituting $\lambda_i$ for $c$. Since we may integrate the right-hand member of (A:193) term by term, the distribution of $m$ is given by

(A:194) F(m) dm = [s F,-(m) j dm 

A.6.4 Some Remarks 

Since $m$ is a discrete variable, it may seem paradoxical that we obtained a probability density function for $m$. However, the explanation lies in the fact that we neglected $\epsilon = Z_n - \bar Z_n$ and this quantity is 0 only in the limiting case when $\mu$ and $\sigma$ approach 0.

If $|\mu|$ and $\sigma$ are sufficiently small as compared with $\log A$ and $|\log B|$, the distribution of $m$ given in (A:194) will be a good approximation to the exact distribution of $m$, even if $z$ is not normally distributed. The reason for this can be indicated as follows. Let

$$z_i^{*} = \sum_{j=(i-1)r+1}^{ir} z_j \quad (i = 1, 2, \cdots, \text{ad inf.})$$

where $r$ is a given positive integer. Since the variates $z_j$ are independently distributed, each having the same distribution, under some weak conditions the variates $z_i^{*}$ ($i = 1, 2, \cdots$, ad inf.) will be nearly normally distributed for large $r$. Hence, considering the cumulative




sums $Z_i^{*} = z_1^{*} + z_2^{*} + \cdots + z_i^{*}$ ($i = 1, 2, \cdots$, ad inf.), the distribution given in (A:194) is applicable with good approximation, provided that $|\mu|$ and $\sqrt{r}\,\sigma$ are small compared with $\log A$ and $|\log B|$ so that the difference $\epsilon^{*} = Z_n^{*} - \bar Z_n^{*}$ can be neglected.

It would be desirable to derive limits for the error in the cumulative distribution of $m$ caused by neglecting $Z_n - \bar Z_n$. No such limits have yet been obtained.

A.7 EFFICIENCY OF THE SEQUENTIAL PROBABILITY RATIO TEST 

Let $S$ be any sequential test for testing $H_0$ against $H_1$ such that the probability of an error of the first kind is $\alpha$, the probability of an error of the second kind is $\beta$, and the probability that the test procedure will eventually terminate is 1. Let $S'$ be the sequential probability ratio test whose strength is equal to that of $S$. We shall prove that the sequential probability ratio test is an optimum test, i.e., that $E_i(n \mid S) \ge E_i(n \mid S')$ ($i = 0, 1$), if for $S'$ the excess of $Z_n$ over $\log A$ and $\log B$ can be neglected.¹ This excess is exactly 0 if $z$ can take only the values $d$ and $-d$ and if $\log A$ and $\log B$ are integral multiples of $d$. In any other case the excess will not be identically 0. However, if $|E(z)|$ and the standard deviation of $z$ are sufficiently small, the excess of $Z_n$ over $\log A$ and $\log B$ is negligible.

For any random variable $u$, we shall denote by $E_i^{*}(u \mid S)$ the conditional expected value of $u$ under the hypothesis $H_i$ ($i = 0, 1$) and under the restriction that $H_0$ is accepted. Similarly, let $E_i^{**}(u \mid S)$ be the conditional expected value of $u$ under the hypothesis $H_i$ ($i = 0, 1$) and under the restriction that $H_1$ is accepted. In the notations for these expected values, the symbol $S$ stands for the sequential test used. Let $Q_i(S)$ denote the totality of all samples for which the test $S$ leads to the acceptance of $H_i$. Then we have


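(A:196) $$E_0^{*}\!\left(\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) = \frac{P_1[Q_0(S)]}{P_0[Q_0(S)]} = \frac{\beta}{1 - \alpha}$$

(A:197) $$E_0^{**}\!\left(\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) = \frac{P_1[Q_1(S)]}{P_0[Q_1(S)]} = \frac{1 - \beta}{\alpha}$$

(A:198) $$E_1^{*}\!\left(\frac{p_{0n}}{p_{1n}} \,\Big|\, S\right) = \frac{P_0[Q_0(S)]}{P_1[Q_0(S)]} = \frac{1 - \alpha}{\beta}$$

and

(A:199) $$E_1^{**}\!\left(\frac{p_{0n}}{p_{1n}} \,\Big|\, S\right) = \frac{P_0[Q_1(S)]}{P_1[Q_1(S)]} = \frac{\alpha}{1 - \beta}$$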


¹ $E_i(n \mid S)$ denotes the expected value of $n$ when $H_i$ is true ($\theta = \theta_i$) and the sequential test $S$ is used.





To prove the optimum property of the sequential probability ratio 
test, we shall first derive two lemmas. 

Lemma A.2. For any random variable $u$ the inequality

(A:200) $$e^{E(u)} \le E(e^{u})$$

holds.

Proof. Inequality (A:200) can be written as

(A:201) $$1 \le E(e^{u'})$$

where $u' = u - E(u)$. Lemma A.2 is proved if we show that (A:201) holds for any random variable $u'$ with zero mean. Expanding $e^{u'}$ in a Taylor series around $u' = 0$, we obtain

(A:202) $$e^{u'} = 1 + u' + \tfrac{1}{2} u'^2 e^{\xi(u')}$$

where $\xi(u')$ lies between 0 and $u'$. Hence

(A:203) $$E(e^{u'}) = 1 + \tfrac{1}{2} E\!\left(u'^2 e^{\xi(u')}\right) \ge 1$$

and lemma A.2 is proved.

Lemma A.3. Let $S$ be a sequential test such that there exists a finite integer $N$ with the property that the number $n$ of observations required for the test is $\le N$. Then²

(A:204) $$E_i(n \mid S) = \frac{E_i\!\left(\log \dfrac{p_{1n}}{p_{0n}} \,\Big|\, S\right)}{E_i(z)} \quad (i = 0, 1)$$

The proof is omitted, since it is essentially the same as that of equation (A:69) for the sequential probability ratio test.

On the basis of lemmas A.2 and A. 3 we shall be able to derive the 
following theorem. 

Theorem: Let $S$ be any sequential test for which the probability of an error of the first kind is $\alpha$, the probability of an error of the second kind is $\beta$, and the probability that the test procedure will eventually terminate is equal to 1. Then

(A:205) $$E_0(n \mid S) \ge \frac{1}{E_0(z)}\left[(1 - \alpha)\log\frac{\beta}{1 - \alpha} + \alpha \log\frac{1 - \beta}{\alpha}\right]$$

and

(A:206) $$E_1(n \mid S) \ge \frac{1}{E_1(z)}\left[\beta \log\frac{\beta}{1 - \alpha} + (1 - \beta)\log\frac{1 - \beta}{\alpha}\right]$$

² The validity of (A:204) has been established under very general conditions even when the probability that $n > N$ is positive for any $N$. See the author's article, "Some Generalizations of the Theory of Cumulative Sums," The Annals of Mathematical Statistics, Vol. 16 (1945), and D. Blackwell, "On an Equation of Wald," The Annals of Mathematical Statistics, Vol. 17 (1946).




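Proof. First we shall prove the theorem in the case when there exists a finite integer $N$ such that $n$ never exceeds $N$. According to lemma A.3 we have

(A:207) $$E_0(n \mid S) = \frac{1}{E_0(z)}\, E_0\!\left(\log\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) = \frac{1}{E_0(z)}\left[(1 - \alpha)\, E_0^{*}\!\left(\log\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) + \alpha\, E_0^{**}\!\left(\log\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right)\right]$$

(A:208) $$E_1(n \mid S) = \frac{1}{E_1(z)}\, E_1\!\left(\log\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) = \frac{1}{E_1(z)}\left[\beta\, E_1^{*}\!\left(\log\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) + (1 - \beta)\, E_1^{**}\!\left(\log\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right)\right]$$

From equations (A:196) through (A:199) and lemma A.2 we obtain the inequalities

(A:209) $$E_0^{*}\!\left(\log\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) \le \log\frac{\beta}{1 - \alpha}$$

(A:210) $$E_0^{**}\!\left(\log\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) \le \log\frac{1 - \beta}{\alpha}$$

(A:211) $$E_1^{*}\!\left(\log\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) = -E_1^{*}\!\left(\log\frac{p_{0n}}{p_{1n}} \,\Big|\, S\right) \ge \log\frac{\beta}{1 - \alpha}$$

and

(A:212) $$E_1^{**}\!\left(\log\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) = -E_1^{**}\!\left(\log\frac{p_{0n}}{p_{1n}} \,\Big|\, S\right) \ge \log\frac{1 - \beta}{\alpha}$$

Since $E_0(z) < 0$, (A:205) follows from (A:207), (A:209), and (A:210). Similarly, since $E_1(z) > 0$, (A:206) follows from (A:208), (A:211), and (A:212). This proves the theorem when a finite integer $N$ exists such that $n \le N$.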

To prove the theorem for any sequential test $S$ of strength $(\alpha, \beta)$,
let $S_N$ be the sequential test we obtain by truncating $S$ at the $N$th
observation if no decision is reached before the $N$th observation. Let
$(\alpha_N, \beta_N)$ be the strength of $S_N$. Then we have

(A:213)    $E_0(n \mid S) \ge E_0(n \mid S_N) \ge \dfrac{1}{E_0(z)} \left[ (1 - \alpha_N) \log \dfrac{\beta_N}{1 - \alpha_N} + \alpha_N \log \dfrac{1 - \beta_N}{\alpha_N} \right]$

and

(A:214)    $E_1(n \mid S) \ge E_1(n \mid S_N) \ge \dfrac{1}{E_1(z)} \left[ \beta_N \log \dfrac{\beta_N}{1 - \alpha_N} + (1 - \beta_N) \log \dfrac{1 - \beta_N}{\alpha_N} \right]$

Since $\lim_{N \to \infty} \alpha_N = \alpha$ and $\lim_{N \to \infty} \beta_N = \beta$, inequalities (A:205) and (A:206)
follow from (A:213) and (A:214). Hence the proof of the theorem is
completed.
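The right-hand members of (A:205) and (A:206) are easy to evaluate.
The following is a minimal sketch, assuming the caller supplies $E_0(z)$
and $E_1(z)$; the function name `asn_lower_bounds` and the numerical case
are illustrative assumptions, not taken from the text.

```python
import math

def asn_lower_bounds(alpha, beta, e0_z, e1_z):
    """Right-hand members of (A:205) and (A:206).

    alpha, beta : probabilities of errors of the first and second kind
    e0_z, e1_z  : E_0(z) < 0 and E_1(z) > 0, the expected value of
                  z = log f_1(x)/f_0(x) under H_0 and under H_1
    """
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    if not (e0_z < 0.0 < e1_z):
        raise ValueError("need E_0(z) < 0 < E_1(z)")
    log_b = math.log(beta / (1.0 - alpha))    # log of beta/(1 - alpha)
    log_a = math.log((1.0 - beta) / alpha)    # log of (1 - beta)/alpha
    bound0 = ((1.0 - alpha) * log_b + alpha * log_a) / e0_z
    bound1 = (beta * log_b + (1.0 - beta) * log_a) / e1_z
    return bound0, bound1

# Illustrative case (an assumption): unit-variance normal observations,
# H_0: mean 0 against H_1: mean 0.5, so E_0(z) = -0.125, E_1(z) = 0.125.
b0, b1 = asn_lower_bounds(0.05, 0.05, -0.125, 0.125)
```

In this symmetric case both bounds equal $0.9 \log 19 / 0.125$, about
21.2 observations.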

If for the sequential probability ratio test $S'$ the excess of the
cumulative sum $Z_n$ over the boundaries $\log A$ and $\log B$ is 0, $E_0(n \mid S')$ is
exactly equal to the right-hand member of (A:205) and $E_1(n \mid S')$ is
exactly equal to the right-hand member of (A:206). Hence, in this
case, $S'$ is exactly an optimum test. If both $|E(z)|$ and $\sigma_z$ are small,
the expected value of the excess over the boundaries will also be
small and, therefore, $E_0(n \mid S')$ and $E_1(n \mid S')$ will be only slightly
larger than the right-hand members of (A:205) and (A:206), respectively.
Thus, in such a case, the sequential probability ratio test is,
if not exactly, very nearly an optimum test.⁸
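The size of the excess over the boundaries can be gauged by simulation.
The sketch below is an illustration under stated assumptions, not the
author's procedure: it runs the sequential probability ratio test for
unit-variance normal observations with $H_0$: mean 0 and $H_1$: mean 0.5,
uses the boundaries $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$, and compares the
simulated $E_0(n)$ with the right-hand member of (A:205) evaluated at the
nominal strength $(0.05, 0.05)$. The actual strength of the simulated test
differs somewhat from the nominal one, so the comparison is only
indicative. All function names, the seed, and the number of trials are
hypothetical.

```python
import math
import random

def run_sprt(theta, theta1, alpha, beta, rng):
    """One test of H_0: mean 0 against H_1: mean theta1 on N(theta, 1) data.

    Returns (n, rejected), where n is the number of observations used.
    """
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    z_sum, n = 0.0, 0
    while log_b < z_sum < log_a:
        x = rng.gauss(theta, 1.0)
        z_sum += theta1 * x - 0.5 * theta1 ** 2   # z = log f_1(x)/f_0(x)
        n += 1
    return n, z_sum >= log_a

def average_sample_number(theta, theta1, alpha, beta, trials, seed):
    """Monte Carlo estimate of E(n) and of the frequency of rejecting H_0."""
    rng = random.Random(seed)
    total_n, rejections = 0, 0
    for _ in range(trials):
        n, rejected = run_sprt(theta, theta1, alpha, beta, rng)
        total_n += n
        rejections += rejected
    return total_n / trials, rejections / trials

mean_n, alpha_hat = average_sample_number(0.0, 0.5, 0.05, 0.05, 4000, seed=12345)
# Right-hand member of (A:205) at the nominal strength; here E_0(z) = -0.125.
bound = (0.95 * math.log(0.05 / 0.95) + 0.05 * math.log(0.95 / 0.05)) / (-0.125)
```

The difference `mean_n - bound` reflects the excess of $Z_n$ over the
boundaries; it shrinks as the two hypothesized means move closer together.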

If $\theta_1$ approaches $\theta_0$, then the ratios of the upper limits of $E_0(n \mid S')$
and $E_1(n \mid S')$, as implied by (A:77) and (A:78), to the right-hand
members of (A:205) and (A:206), respectively, converge to 1. Thus,
the efficiency⁹ of the sequential probability ratio test, if not exactly 1,
converges to 1 when $\theta_1 \to \theta_0$. The upper bounds for $E_0(n \mid S')$ and
$E_1(n \mid S')$ given in (A:77) and (A:78) determine lower bounds for the
efficiency of the sequential probability ratio test $S'$.

⁸ The author conjectures that the sequential probability ratio test is exactly an
optimum test even if the excess of $Z_n$ over the boundaries is not 0. However, he
did not succeed in proving this.

⁹ For the definition of the efficiency of a sequential test see Section 2.4.1.


A.8 DETERMINATION OF AN OPTIMUM WEIGHT FUNCTION $w(\theta)$ IN
SOME SPECIAL CASES OF TESTING SIMPLE HYPOTHESES WITH
NO RESTRICTIONS ON THE POSSIBLE ALTERNATIVE VALUES OF
THE PARAMETERS

A.8.1 A Class of Cases for Which an Optimum Weight Function $w(\theta)$
Can Be Determined by a Simple Procedure

Let $(\theta_1, \ldots, \theta_k) = (\theta_1^0, \ldots, \theta_k^0)$ be the simple hypothesis $H_0$ to be
tested and denote the distribution of $x$ by $f(x, \theta_1, \ldots, \theta_k)$. Assume
the boundary of the zone $\omega_r$ of preference for rejection is a surface in
the parameter space and denote it by $S_r$. Assume, further, that it is
possible to find a non-negative function $v(\theta)$ of the parameter $\theta$ such
that the surface integral¹⁰

(A:215)    $\int_{S_r} v(\theta)\, dS = 1$

and the sequential probability ratio test based on the ratio

(A:216)    $\dfrac{p_{1n}}{p_{0n}} = \dfrac{\int_{S_r} v(\theta)\, f(x_1, \theta_1, \ldots, \theta_k) \cdots f(x_n, \theta_1, \ldots, \theta_k)\, dS}{f(x_1, \theta_1^0, \ldots, \theta_k^0) \cdots f(x_n, \theta_1^0, \ldots, \theta_k^0)}$

satisfies the following two conditions (for any values $A$ and $B$): (1)
The probability $\beta(\theta)$ of committing an error of the second kind (of
accepting $H_0$ when $\theta$ is true) is constant over the surface $S_r$; (2) for
any point $\theta$ in the interior of $\omega_r$, the value of $\beta(\theta)$ does not exceed the
constant value of $\beta(\theta)$ on the surface $S_r$.

We shall now show that $v(\theta)$ may be regarded as an optimum weight
function in the sense defined in Section 4.1.3, and the probability ratio
test based on the ratio (A:216) provides a solution to our problem.
In fact, the weight function $v(\theta)$ over the surface $S_r$ can be considered
a limiting case of a weight function $w(\theta)$ which takes the value 0 for
any $\theta$ in the interior of $\omega_r$ whose distance from the boundary exceeds
some positive $\Delta$, with $\Delta$ approaching 0 in the limit. It follows from
conditions (1) and (2) that for the weight function $v(\theta)$ the maximum
of $\beta(\theta)$ in $\omega_r$ is equal to the weighted integral of $\beta(\theta)$, i.e., to
$\int_{S_r} \beta(\theta)\, v(\theta)\, dS$. Consider now any other weight function $w^*(\theta)$ and
denote the resulting probability of an error of the second kind by
$\beta^*(\theta)$ when $w^*(\theta)$ is used instead of $v(\theta)$. It has been shown in Section
4.1.3 that the following relations hold with sufficient approximation
for practical purposes:

(A:217)    $\int_{\omega_r} \beta^*(\theta)\, w^*(\theta)\, d\theta = \int_{S_r} \beta(\theta)\, v(\theta)\, dS = \dfrac{B(A - 1)}{A - B}$

Hence the maximum of $\beta^*(\theta)$ in $\omega_r$ is $\ge B(A - 1)/(A - B)$. The
optimum property of the weight function $v(\theta)$ follows then from the
fact that the maximum of $\beta(\theta)$ is equal to $B(A - 1)/(A - B)$.

¹⁰ $dS$ denotes the infinitesimal surface element.

In several important statistical problems one can easily find a weight
function $v(\theta)$ such that conditions (1) and (2) are fulfilled. We shall
show, for example, that such a weight function $v(\theta)$ can easily be
determined for testing the means of normally distributed variables with
known variances. After the weight function $v(\theta)$ has been found, for
practical purposes we may let $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$
where $\alpha$ is the required value of the probability of an error of the first
kind and $\beta$ is the required upper limit for $\beta(\theta)$.
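The choice of $A$ and $B$ just described can be checked against the
approximation $B(A - 1)/(A - B)$ for the error of the second kind and the
companion approximation $(1 - B)/(A - B)$ for the error of the first kind
that appears in Section A.9.1. The sketch below is illustrative; the
function names and the numerical values are assumptions.

```python
def wald_boundaries(alpha, beta):
    """Return A = (1 - beta)/alpha and B = beta/(1 - alpha)."""
    return (1.0 - beta) / alpha, beta / (1.0 - alpha)

def approximate_errors(a, b):
    """Approximate error probabilities implied by boundaries A > 1 > B > 0.

    First kind: (1 - B)/(A - B); second kind: B(A - 1)/(A - B).
    """
    return (1.0 - b) / (a - b), b * (a - 1.0) / (a - b)

# Hypothetical requirements: alpha = 0.05, beta = 0.10.
a, b = wald_boundaries(0.05, 0.10)
alpha_back, beta_back = approximate_errors(a, b)
```

Substituting $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$ into the two approximations
returns exactly $\alpha$ and $\beta$, which is why this choice is convenient in
practice.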

Although we have so far assumed that $X$ is a single random variable,
all the results remain obviously valid when $X$ is a random vector,
i.e., $X$ represents a set of $p$ $(p > 1)$ random variables $X_1, \ldots, X_p$.
The only change in the formulas is that the $\alpha$th observation $x_\alpha$ will
have to be replaced by a set $(x_{1\alpha}, \ldots, x_{p\alpha})$ of $p$ values where $x_{i\alpha}$
represents the $\alpha$th observation on $X_i$.


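A.8.2 Application to Testing the Means of Independently and Normally
Distributed Random Variables with Known Variances

Let $X_1, \ldots, X_k$ be $k$ normally and independently distributed random
variables with a common known variance $\sigma^2$. The mean values
$\theta_1, \ldots, \theta_k$ are assumed to be unknown. Suppose that it is required
to test the hypothesis that $(\theta_1, \ldots, \theta_k) = (\theta_1^0, \ldots, \theta_k^0)$. Assume that
the zone $\omega_r$ of preference for rejection is given by

$\sqrt{(\theta_1 - \theta_1^0)^2 + \cdots + (\theta_k - \theta_k^0)^2} \ge \delta\sigma$

where $\delta$ is some given positive value. Then the boundary $S_r$ of $\omega_r$ is
a sphere with center $\theta^0 = (\theta_1^0, \ldots, \theta_k^0)$ and radius $\delta\sigma$. Let $v(\theta)$ be
constant over $S_r$ and equal to the reciprocal of the area of $S_r$. We
shall show that for this weight function conditions (1) and (2) of the
preceding section are fulfilled. For this purpose, we shall first prove
that the ratio (A:216) is a monotonically increasing function of
$(\bar{x}_1 - \theta_1^0)^2 + \cdots + (\bar{x}_k - \theta_k^0)^2$ where $\bar{x}_i$ is the arithmetic mean of the
observations on $X_i$. In fact, in our case the ratio (A:216) reduces to

(A:218)    $\dfrac{p_{1n}}{p_{0n}} = c \int_{S_r} \exp\left\{ -\dfrac{1}{2\sigma^2} \sum_{i=1}^{k} \sum_{\alpha=1}^{n} \left[ (x_{i\alpha} - \theta_i)^2 - (x_{i\alpha} - \theta_i^0)^2 \right] \right\} dS$

where $c$ is equal to the reciprocal of the area of $S_r$. Let $r_{\bar{x}}$ denote
$\dfrac{1}{\sigma} \sqrt{\sum_{i=1}^{k} (\bar{x}_i - \theta_i^0)^2}$ and let $\rho(\theta)$ $(0 \le \rho \le \pi)$ denote the angle
between the vector $(\bar{x}_1 - \theta_1^0, \ldots, \bar{x}_k - \theta_k^0)$ and the vector
$(\theta_1 - \theta_1^0, \ldots, \theta_k - \theta_k^0)$. Then (A:218) can be written as

(A:219)    $\dfrac{p_{1n}}{p_{0n}} = c\, e^{-n\delta^2/2} \int_{S_r} e^{\,n r_{\bar{x}} \delta \cos[\rho(\theta)]}\, dS$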
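The passage from (A:218) to (A:219) rests on a short piece of algebra
that the text leaves to the reader. The following worked steps are a
sketch supplied for convenience; they use only the definitions of $\bar{x}_i$,
$r_{\bar{x}}$, $\rho(\theta)$, and the fact that every $\theta$ on $S_r$ lies at distance
$\delta\sigma$ from $\theta^0$.

```latex
\begin{align*}
\sum_{\alpha=1}^{n}\bigl[(x_{i\alpha}-\theta_i)^2-(x_{i\alpha}-\theta_i^0)^2\bigr]
  &= n(\theta_i-\theta_i^0)^2 - 2n(\bar{x}_i-\theta_i^0)(\theta_i-\theta_i^0),\\
\sum_{i=1}^{k}(\theta_i-\theta_i^0)^2
  &= \delta^2\sigma^2 \quad\text{for } \theta \text{ on } S_r,\\
\sum_{i=1}^{k}(\bar{x}_i-\theta_i^0)(\theta_i-\theta_i^0)
  &= (\sigma r_{\bar{x}})(\delta\sigma)\cos[\rho(\theta)],\\
-\frac{1}{2\sigma^2}\sum_{i=1}^{k}\sum_{\alpha=1}^{n}
  \bigl[(x_{i\alpha}-\theta_i)^2-(x_{i\alpha}-\theta_i^0)^2\bigr]
  &= -\frac{n\delta^2}{2} + n\,r_{\bar{x}}\,\delta\cos[\rho(\theta)].
\end{align*}
```

The first term of the last line does not depend on $\theta$ and can be taken
outside the integral, which leaves the exponential of
$n r_{\bar{x}} \delta \cos[\rho(\theta)]$ under the integral sign.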




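Because of the symmetry of the sphere, the value of (A:219) will not
be changed if we substitute $\gamma(\theta)$ for $\rho(\theta)$, where $\gamma(\theta)$ $(0 \le \gamma \le \pi)$
denotes the angle between the vector $\theta - \theta^0$ and an arbitrarily chosen
fixed vector $u$. From this it follows that the value of (A:219) depends
only on $r_{\bar{x}}$.

Now we shall show that (A:219) is a strictly increasing function of
$r_{\bar{x}}$. For this purpose we merely have to show that

(A:220)    $I(r_{\bar{x}}) = \int_{S_r} e^{\,n r_{\bar{x}} \delta \cos[\gamma(\theta)]}\, dS$

is a strictly increasing function of $r_{\bar{x}}$. We have

(A:221)    $\dfrac{dI}{dr_{\bar{x}}} = \int_{S_r} n \delta \cos[\gamma(\theta)]\, e^{\,n r_{\bar{x}} \delta \cos[\gamma(\theta)]}\, dS$

Denote by $S'_r$ the subset of $S_r$ in which $0 \le \gamma(\theta) \le \pi/2$, and by $S''_r$
the subset in which $\pi/2 < \gamma(\theta) \le \pi$. Because of the symmetry of
the sphere we have

(A:222)    $\int_{S''_r} n \delta \cos[\gamma(\theta)]\, e^{\,n r_{\bar{x}} \delta \cos[\gamma(\theta)]}\, dS = \int_{S'_r} n \delta \cos[\pi - \gamma(\theta)]\, e^{\,n r_{\bar{x}} \delta \cos[\pi - \gamma(\theta)]}\, dS = -\int_{S'_r} n \delta \cos[\gamma(\theta)]\, e^{-n r_{\bar{x}} \delta \cos[\gamma(\theta)]}\, dS$

Hence

(A:223)    $\dfrac{dI}{dr_{\bar{x}}} = \int_{S_r} n \delta \cos[\gamma(\theta)]\, e^{\,n r_{\bar{x}} \delta \cos[\gamma(\theta)]}\, dS = \int_{S'_r} n \delta \cos[\gamma(\theta)] \left( e^{\,n r_{\bar{x}} \delta \cos[\gamma(\theta)]} - e^{-n r_{\bar{x}} \delta \cos[\gamma(\theta)]} \right) dS$

The right-hand side of (A:223) is positive. Hence, we have proved
that expression (A:219) or (A:218) is a strictly increasing function of
$r_{\bar{x}}$. We shall now show that $\beta(\theta)$ is constant over any sphere $S_r(d)$
with center $\theta^0$ and radius $d$ and that it decreases monotonically with
increasing $d$. For this purpose let $y_1, \ldots, y_k$ be an orthogonal
linear transformation of $x_1 - \theta_1^0, \ldots, x_k - \theta_k^0$ so that $E(y_1) =
\sqrt{(\theta_1 - \theta_1^0)^2 + \cdots + (\theta_k - \theta_k^0)^2}$ and $E(y_i) = 0$ $(i = 2, \ldots, k)$. Since
$\bar{y}_1^2 + \cdots + \bar{y}_k^2 = (\bar{x}_1 - \theta_1^0)^2 + \cdots + (\bar{x}_k - \theta_k^0)^2$ and since (A:219)
depends only on $(\bar{x}_1 - \theta_1^0)^2 + \cdots + (\bar{x}_k - \theta_k^0)^2$, it is seen that the
sequence of expressions (A:219) formed for the sequence of integers
$n = 1, 2, \ldots$, etc., has a joint distribution which depends only on
$\sqrt{(\theta_1 - \theta_1^0)^2 + \cdots + (\theta_k - \theta_k^0)^2}$. Hence $\beta(\theta)$ is constant on any
sphere $S_r(d)$. Since (A:219) is a strictly monotonic function of $r_{\bar{x}}$, it
can be shown that $\beta(\theta)$ is monotonically decreasing with increasing $d$.
Hence, conditions (1) and (2) of the preceding section are fulfilled and
we can test the hypothesis that $\theta = \theta^0$ by the sequential probability
ratio test based on the ratio (A:218).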

If $k = 1$, i.e., if we test the mean value of a single random variable
$X$, the sphere $S_r$ is a null-dimensional sphere consisting of the two
points $\theta_1 = \delta\sigma$ and $\theta_2 = -\delta\sigma$ and (A:216) reduces to the ratio of $p_{1n}$
to $p_{0n}$ given by (4:8) and (4:9), respectively, in Section 4.1.4.
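For small $k$ the ratio can be evaluated directly. The sketch below is an
illustration under stated assumptions: for $k = 1$ it averages the two
likelihood ratios at $\theta^0 \pm \delta\sigma$, and for $k = 2$ it replaces the surface
integral in (A:219) by an equally weighted average over points of the
circle. The function names, the argument `r` (standing for
$\frac{1}{\sigma}\sqrt{\sum_i (\bar{x}_i - \theta_i^0)^2}$), and the number of quadrature points
are hypothetical choices.

```python
import math

def ratio_k1(xbar, theta0, sigma, delta, n):
    """k = 1: mean of the likelihood ratios at theta0 + delta*sigma and
    theta0 - delta*sigma, each taken against theta0."""
    t = (xbar - theta0) / sigma
    up = math.exp(n * delta * t - 0.5 * n * delta ** 2)
    down = math.exp(-n * delta * t - 0.5 * n * delta ** 2)
    return 0.5 * (up + down)

def ratio_k2(r, delta, n, steps=2000):
    """k = 2: exp(-n*delta^2/2) times the circle average of
    exp(n*r*delta*cos(gamma)), with v(theta) constant on the circle."""
    a = n * r * delta
    h = 2.0 * math.pi / steps
    average = sum(math.exp(a * math.cos(j * h)) for j in range(steps)) / steps
    return math.exp(-0.5 * n * delta ** 2) * average

# Hypothetical values: the ratio should grow with r for fixed n and delta.
values = [ratio_k2(r, 1.0, 4) for r in (0.2, 0.5, 1.0)]
```

For $k = 1$ the result equals $e^{-n\delta^2/2} \cosh[n\delta(\bar{x} - \theta^0)/\sigma]$, which makes the
dependence on $|\bar{x} - \theta^0|$ alone, and the monotonicity, visible at a glance.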


A.9 DETERMINATION OF OPTIMUM WEIGHT FUNCTIONS $w_a(\theta)$ AND
$w_r(\theta)$ IN SOME SPECIAL CASES OF TESTING COMPOSITE HYPOTHESES

A.9.1 A Class of Cases for Which Optimum Weight Functions $w_a(\theta)$
and $w_r(\theta)$ Can Be Determined by a Simple Procedure

Let $f(x, \theta_1, \ldots, \theta_k)$ denote the distribution of $x$ involving $k$ unknown
parameters $\theta_1, \ldots, \theta_k$. Suppose we wish to test the composite
hypothesis $H_\omega$ that the parameter point $\theta$ lies in the subset $\omega$ of the parameter
space. Let $\omega_a$ denote the zone of preference for acceptance and $\omega_r$ the
zone of preference for rejection. Assume that the boundary of $\omega_r$ is
a surface $S_r$. Suppose that it is possible to find two weight functions
$v_a(\theta)$ and $v_r(\theta)$ such that

$\int_{\omega_a} v_a(\theta)\, d\theta = 1$ and $\int_{S_r} v_r(\theta)\, dS_r = 1$

and that the sequential probability ratio test based on the ratio

(A:224)    $\dfrac{p_{1n}}{p_{0n}} = \dfrac{\int_{S_r} v_r(\theta) \prod_{\alpha=1}^{n} f(x_\alpha, \theta_1, \ldots, \theta_k)\, dS_r}{\int_{\omega_a} v_a(\theta) \prod_{\alpha=1}^{n} f(x_\alpha, \theta_1, \ldots, \theta_k)\, d\theta}$

satisfies the following conditions (for any values $A$ and $B$): (1) $\alpha(\theta)$ is
constant in $\omega_a$; (2) $\beta(\theta)$ is constant over $S_r$; (3) for any point $\theta$ in the
interior of $\omega_r$, the value of $\beta(\theta)$ does not exceed the constant value of
$\beta(\theta)$ on $S_r$.

We shall now show that $v_a(\theta)$ and $v_r(\theta)$ may be regarded as optimum
weight functions in the sense defined in Section 4.2.2. For this purpose,
let $w_a(\theta)$ and $w_r(\theta)$ be any other weight functions and let $\alpha^*(\theta)$
and $\beta^*(\theta)$ be the resulting probabilities of errors of the first and second
kinds when $w_a(\theta)$ and $w_r(\theta)$ are used. Since, as has been shown,

(A:225)    $\int_{\omega_a} \alpha^*(\theta)\, w_a(\theta)\, d\theta = \dfrac{1 - B}{A - B}$ and $\int_{\omega_r} \beta^*(\theta)\, w_r(\theta)\, d\theta = \dfrac{B(A - 1)}{A - B}$

hold with good approximation, we see that in $\omega_a$ the maximum of
$\alpha^*(\theta) \ge (1 - B)/(A - B)$ and in $\omega_r$ the maximum of $\beta^*(\theta) \ge
B(A - 1)/(A - B)$ with good approximation. But if $v_a(\theta)$ and
$v_r(\theta)$ are used, it follows from conditions (1), (2), and (3) that (with
good approximation) the maximum of $\alpha(\theta)$ in $\omega_a$ is equal to
$(1 - B)/(A - B)$ and the maximum of $\beta(\theta)$ in $\omega_r$ is equal to
$B(A - 1)/(A - B)$. Hence these weight functions are optimum in
the sense defined in Section 4.2.2.

In some special but important statistical problems one can easily
find weight functions $v_a(\theta)$ and $v_r(\theta)$ which satisfy conditions (1), (2),
and (3). It will be seen in the next section that such weight functions
can easily be constructed when the mean of a normal distribution with
unknown variance is being tested. Again, for practical purposes we
may let $A = (1 - \beta)/\alpha$ and $B = \beta/(1 - \alpha)$, where $\alpha$ is the required
upper bound of $\alpha(\theta)$ in $\omega_a$ and $\beta$ is the required upper bound of $\beta(\theta)$
in $\omega_r$.


A.9.2 Application to Testing the Mean of a Normal Distribution with
Unknown Variance (Sequential t-Test)


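Let $X$ be a normally distributed random variable with unknown
mean $\theta$ and unknown variance $\sigma^2$. Suppose we wish to test the
hypothesis that $\theta = \theta_0$. Furthermore, assume that $\omega_r$ is given by the set
of all points $(\theta, \sigma)$ for which $\left| \dfrac{\theta - \theta_0}{\sigma} \right| \ge \delta$, while $\omega_a$ consists of all
points $(\theta_0, \sigma)$. Then the boundary $S_r$ of $\omega_r$ consists of all points $(\theta, \sigma)$
for which $\left| \dfrac{\theta - \theta_0}{\sigma} \right| = \delta$, i.e., it contains the points $(\theta, \sigma)$ for which
either $\theta = \theta_0 + \delta\sigma$ or $\theta = \theta_0 - \delta\sigma$.

For any positive value $c$ we define the weight functions $v_{ac}(\sigma)$ and
$v_{rc}(\sigma)$ as follows: $v_{ac}(\sigma) = 1/c$ if $0 \le \sigma \le c$ and equals 0 for all other
values of $\sigma$. The weight function $v_{rc}(\sigma)$ is equal to $1/2c$ if $0 \le \sigma \le c$
and $\theta = \theta_0 \pm \delta\sigma$ and equal to 0 otherwise. Let

(A:226)    $p_{1n} = \dfrac{1}{2c} \int_0^c \dfrac{1}{(2\pi)^{n/2} \sigma^n} \left( e^{-\frac{1}{2\sigma^2} \sum (x_\alpha - \theta_0 - \delta\sigma)^2} + e^{-\frac{1}{2\sigma^2} \sum (x_\alpha - \theta_0 + \delta\sigma)^2} \right) d\sigma$

and

(A:227)    $p_{0n} = \dfrac{1}{c} \int_0^c \dfrac{1}{(2\pi)^{n/2} \sigma^n}\, e^{-\frac{1}{2\sigma^2} \sum (x_\alpha - \theta_0)^2}\, d\sigma$

Then

(A:228)    $\dfrac{p_{1n}}{p_{0n}} = \dfrac{\dfrac{1}{2} \int_0^c \dfrac{1}{\sigma^n} \left( e^{-\frac{1}{2\sigma^2} \sum (x_\alpha - \theta_0 - \delta\sigma)^2} + e^{-\frac{1}{2\sigma^2} \sum (x_\alpha - \theta_0 + \delta\sigma)^2} \right) d\sigma}{\int_0^c \dfrac{1}{\sigma^n}\, e^{-\frac{1}{2\sigma^2} \sum (x_\alpha - \theta_0)^2}\, d\sigma}$

We consider the limiting case when $c \to \infty$. Thus

(A:229)    $\dfrac{p_{1n}}{p_{0n}} = \dfrac{\dfrac{1}{2} \int_0^\infty \dfrac{1}{\sigma^n} \left( e^{-\frac{1}{2\sigma^2} \sum (x_\alpha - \theta_0 - \delta\sigma)^2} + e^{-\frac{1}{2\sigma^2} \sum (x_\alpha - \theta_0 + \delta\sigma)^2} \right) d\sigma}{\int_0^\infty \dfrac{1}{\sigma^n}\, e^{-\frac{1}{2\sigma^2} \sum (x_\alpha - \theta_0)^2}\, d\sigma}$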

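The sequential probability ratio test based on the ratio (A:229)
provides a solution to our problem if it can be shown to have the
following three properties: (1) $\alpha(\theta, \sigma)$ is constant in $\omega_a$; (2) $\beta(\theta, \sigma)$ is a
function of $\left| \dfrac{\theta - \theta_0}{\sigma} \right|$ alone; (3) $\beta(\theta, \sigma)$ is monotonically decreasing with
increasing $\left| \dfrac{\theta - \theta_0}{\sigma} \right|$.

To prove these three properties, let $\bar{x}$ denote $\dfrac{1}{n} \sum_{\alpha=1}^{n} x_\alpha$ and $S$ denote
$\sqrt{\sum_{\alpha=1}^{n} (x_\alpha - \bar{x})^2}$. Since the joint distribution of a sequence of expressions
$\dfrac{|\bar{x} - \theta_0|}{S}$ corresponding to consecutive values of $n$ depends only on
$\dfrac{|\theta - \theta_0|}{\sigma}$, the first two properties are proved if we show that the ratio
(A:229) is a single-valued function of $\dfrac{|\bar{x} - \theta_0|}{S}$.

First we show that the numerator of the ratio (A:229) is a homogeneous
function of $(x_1 - \theta_0, x_2 - \theta_0, \ldots, x_n - \theta_0)$ of degree
$-(n - 1)$. In fact, making the transformation $\sigma = \lambda t$ we obtain

$\int_0^\infty \dfrac{1}{\sigma^n} \left( e^{-\frac{1}{2\sigma^2} \sum (\lambda x_\alpha - \lambda\theta_0 - \delta\sigma)^2} + e^{-\frac{1}{2\sigma^2} \sum (\lambda x_\alpha - \lambda\theta_0 + \delta\sigma)^2} \right) d\sigma$

$= \int_0^\infty \dfrac{1}{(\lambda t)^n} \left( e^{-\frac{1}{2t^2} \sum (x_\alpha - \theta_0 - \delta t)^2} + e^{-\frac{1}{2t^2} \sum (x_\alpha - \theta_0 + \delta t)^2} \right) d(\lambda t)$

$= \dfrac{1}{\lambda^{n-1}} \int_0^\infty \dfrac{1}{t^n} \left( e^{-\frac{1}{2t^2} \sum (x_\alpha - \theta_0 - \delta t)^2} + e^{-\frac{1}{2t^2} \sum (x_\alpha - \theta_0 + \delta t)^2} \right) dt$

This proves that the numerator of (A:229) is a homogeneous function
of $x_1 - \theta_0, \ldots, x_n - \theta_0$ of degree $-(n - 1)$. Similarly, it can be
shown that the denominator of (A:229) is also a homogeneous function
of degree $-(n - 1)$. Thus, the ratio (A:229) is a homogeneous
function of zero degree in the variables $x_1 - \theta_0, \ldots, x_n - \theta_0$.

It can be verified that (A:229) is a function of only the two expressions
$\sum (x_\alpha - \theta_0)^2$ and $\sum (x_\alpha - \theta_0)$, i.e.,

(A:230)    $\dfrac{p_{1n}}{p_{0n}} = \Phi\left[ \sum (x_\alpha - \theta_0)^2,\ \sum (x_\alpha - \theta_0) \right]$

Let $v = \left| \sqrt{\sum (x_\alpha - \theta_0)^2} \right|$. Since (A:230) is a homogeneous function
of zero degree in $x_1 - \theta_0, \ldots, x_n - \theta_0$, its value is not changed by
substituting $(x_\alpha - \theta_0)/v$ for $x_\alpha - \theta_0$. Hence

(A:231)    $\dfrac{p_{1n}}{p_{0n}} = \Phi\left[ \sum \left( \dfrac{x_\alpha - \theta_0}{v} \right)^2,\ \sum \dfrac{x_\alpha - \theta_0}{v} \right] = \Phi\left[ 1,\ \dfrac{n(\bar{x} - \theta_0)}{v} \right]$

Since $\Phi\left[ \sum (x_\alpha - \theta_0)^2,\ \sum (x_\alpha - \theta_0) \right] = \Phi\left[ \sum (x_\alpha - \theta_0)^2,\ -\sum (x_\alpha - \theta_0) \right]$, we
see that $\dfrac{p_{1n}}{p_{0n}}$ is a single-valued function of $\dfrac{|\bar{x} - \theta_0|}{v}$. Since $\dfrac{|\bar{x} - \theta_0|}{v}$
is a single-valued function of $\dfrac{|\bar{x} - \theta_0|}{S}$, we have proved
that $\dfrac{p_{1n}}{p_{0n}}$ is a single-valued function of $\dfrac{|\bar{x} - \theta_0|}{S}$. Hence properties (1)
and (2) are proved.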

In order to prove property (3) of the sequential probability ratio
test based on the ratio (A:229), it is sufficient to show that (A:229)
is a strictly increasing function of $\dfrac{|\bar{x} - \theta_0|}{S}$. Since $\dfrac{|\bar{x} - \theta_0|}{v}$ is a
strictly increasing function of $\dfrac{|\bar{x} - \theta_0|}{S}$, we have only to show that
(A:229) is a strictly increasing function of $\dfrac{|\bar{x} - \theta_0|}{v}$. The latter
statement is proved if we show that (A:229) increases with increasing
value of $|\bar{x} - \theta_0|$ while $v$ is kept fixed. For a fixed value of $v$ the
denominator of (A:229) is constant. Thus, we merely have to show
that the numerator of (A:229) increases with increasing $|\bar{x} - \theta_0|$
while $v$ is kept fixed. This follows easily from the fact that

$e^{\,n(\bar{x} - \theta_0)\delta/\sigma} + e^{-n(\bar{x} - \theta_0)\delta/\sigma}$

is a strictly increasing function of $|\bar{x} - \theta_0|$.
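The properties proved above can be checked numerically. The sketch
below is an illustration, not the author's computing procedure: it
substitutes $u = 1/\sigma$ in (A:229), which turns the ratio into
$e^{-n\delta^2/2} \int_0^\infty u^{n-2} e^{-v^2 u^2/2} \cosh(\delta T u)\, du \,/ \int_0^\infty u^{n-2} e^{-v^2 u^2/2}\, du$
with $T = \sum (x_\alpha - \theta_0)$ and $v^2 = \sum (x_\alpha - \theta_0)^2$, and evaluates both integrals by
the trapezoidal rule on a truncated range. The function name `t_ratio`,
the truncation point, and the step count are assumptions; the routine is
meant for small $n$ only, since `math.cosh` overflows for large arguments.

```python
import math

def t_ratio(xs, theta0, delta, steps=4000):
    """Approximate the ratio (A:229) for observations xs (needs n >= 2)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    dev = [x - theta0 for x in xs]
    total = sum(dev)                           # T = sum of (x_a - theta0)
    v = math.sqrt(sum(d * d for d in dev))     # v as defined before (A:231)
    if v == 0.0:
        raise ValueError("all observations equal theta0")
    b = delta * abs(total)
    # Truncate where the Gaussian factor makes the integrand negligible.
    u_max = b / (v * v) + (math.sqrt(n) + 12.0) / v
    h = u_max / steps
    num = 0.0
    den = 0.0
    for j in range(steps + 1):
        u = j * h
        weight = 0.5 if j in (0, steps) else 1.0   # trapezoidal end weights
        base = u ** (n - 2) * math.exp(-0.5 * v * v * u * u)
        num += weight * base * math.cosh(b * u)
        den += weight * base
    return math.exp(-0.5 * n * delta ** 2) * num / den

# Hypothetical sample; theta0 = 0 and delta = 1 are illustrative choices.
sample = [0.3, 1.1, -0.4, 0.9, 0.6]
ratio = t_ratio(sample, 0.0, 1.0)
```

Multiplying every $x_\alpha - \theta_0$ by a positive constant, or changing all their
signs, leaves the computed ratio unchanged, in line with the homogeneity
of degree zero and the symmetry of $\Phi$ used in the proof.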







A CATALOGUE OF SELECTED DOVER BOOKS 

IN ALL FIELDS OF INTEREST 


A CATALOGUE OF SELECTEE) DOVER BOOKS 

IN ALL FIELDS OF INTEREST 


America’s Old Masters, James T. Flexner. Four men emerged unexpectedly 
from provincial 18th century America to leadership in European art: Benjamin 
West, J. S. Copley, C. R. Peale, Gilbert Stuart. Brilliant coverage of lives and con- 
tributions. Revised, 1967 edition. 69 plates. 365pp. of text. 

21806-6 Paperbound $3.00 

First Flowers of Our Wilderness: American Painting, The Colonial 
Period, James T. Flexner. Painters, and regional painting traditions from earliest 
Colonial times up to the emergence of Copley, West and Peale Sr., Foster, Gustavus 
Hesselius, Feke, John Smibert and many anonymous painters in the primitive manner. 
Engaging presentation, with 162 illustrations, xxii + 368pp. 

22180-6 Paperbound $3.50 

The Light of Distant Skies: American Painting, 1760-1835, James T. Flex- 
ner. The great generation of early American painters goes to Europe to learn and 
to teach: West, Copley. Gilbert Stuart and others. Allston, Trumbull, Morse; also 
contemporary American painters — primitives, derivatives, academics — who remained 
in America. 102 illustrations, xiii + 306pp. 22179*2 Paperbound $3.50 

A History of the Rise and Progress of the Arts of Design in the United 
States, William Dunlap Much the richest mine of information on early American 
painters, sculptors, architects, engravers, miniaturists, etc. The only source of in- 
formation for scores of artists, the major primary source for many others. Unabridged 
reprint of rare original 1834 edition, with new introduction by James T. Flexner, 
and 394 new illustrations. Edited by Rita Weiss. 6% x 9^. 

21695-0, 21696-9, 21697-7 Three volumes, Paperbound $15.00 

Epochs of Chinese and Japanese Art, Ernest F. Fenollosa. From primitive 
Chinese art to the 20th century, thorough history, explanation of every important art 
period and form, including Japanese woodcuts; main stress on China and Japan, but 
Tibet, Korea also included. Still unexcelled for its detailed, rich coverage of cul- 
tural background, aesthetic elements, diffusion studies, particularly of the historical 
period. 2nd, 1913 edition. 242 illustrations, lii + 439pp. of text. 

20364-6, 20365-4 Two volumes, Paperbound $6.00 

The Gentle Art of Making Enemies, James A. M. Whistler. Greatest wit of his 
day deflates Oscar Wilde, Ruskin, Swinburne; strikes back at inane critics, exhibi- 
tions, art journalism; aesthetics of impressionist revolution in most striking form. 
Highly readable classic by great painter. Reproduction of edition designed by 

Whistler Introduction by Alfred Werner, xxxvi -j- 334pp. 

21875-9 Paperbound $3.00 



CATALOGUE OE DOVER BOOKS 


Visual Illusions: Their Causes, Characteristics, and Applications, Mat- 
thew Luckiesh. Thorough description and discussion of optical illusion, geometric 
and perspective, particularly ; size and shape distortions, illusions of color, of motion ; 
natural illusions; use of illusion in art and magic, industry, etc. Most useful today 
with op art, also for classical art. Scores of effects illustrated. Introduction by 
William H. Ittleson. 100 illustrations, xxi d- 252pp. 

21530-X Paperbound $2.00 


A Handbook of Anatomy for Art Students, Arthur Thomson. Thorough, vir- 
tually exhaustive coverage of skeletal structure, musculature, etc. Full text, supple- 
mented by anatomical diagrams and drawings and by photographs of undraped 
figures. Unique in its comparison of male and female forms, pointing out differences 
of contour, texture, form. 211 figures. 40 drawings, 86 photographs, xx -f- 459pp. 
554 x 8 ^ 8 - 21163-0 Paperbound $3.50 

150 Masterpieces of Drawing, Selected by Anthony Toney. Full page reproduc- 
tions of drawings from the early l6th to the end of the 18th century, all beautifully 
reproduced: Rembrandt, Michelangelo, Diarer, Fragonard. Urs, Graf, Wouwerman, 
many others. First-rate browsing book, model book for artists, xviii -j- 150pp. 

8 ^x 1114 . 21032-4 Paperbound S 2. 50 

The Later Work of Aubrey Beardsley. Aubrey Beardsley. Exotic, erotic, 
ironic masterpieces in full maturity: Comedy Ballet. Venus and Tannhauser Pierrot 
Lysistrata, Rape of the Lock. Savoy material, Ali Baba. Volpone, etc. This material 
revolutionized the art world, and is still powerful, fresh, brilliant. With The Early 
iVork, all Beardsley's finest work. 174 plates, 2 in color, xiv + 176pp. SVs x 11 . 

21817-1 Paperbound $3.00 

Drawings of Rembrandt. Rembrandt van Rijn. Complete reproduction of fabu- 
lously rare edition by I ippm.,nn and Hofstede de Groot, completely reedited, up- 
dated, improved by Prof. Seymour Slive, Fogg Museum. Portraits, Biblical sketche.s, 
landscapes. Oriental types, nudes, episodes from classical mythology — All Rcm- 
brandts fertile genius. Also selection of drawings by his pupils and followers. 
Stunning volumes. Sa/»r^ay Rctuu. 550 illustrations. Ixxviii -f- 552pp. 

^ 214^5-0. 21486-9 Two volumes, Paperbound $10.00 

Goya. One of ll,c masterpieces of Western civi- 
liaation 83 etchings that record Goya’s shattering, bitter rc.iction to the N.ipolconic 
war th. 1 t swept through Spam after the insurrection of 1808 and to war in general 

... plates from Boston's Museum of Fine Arts. ... size.
Introduction by Philip Hofer, Fogg Museum.
... pp. 9 3/8 x 8 1/4. 21872-4 Paperbound $2.00

... Largest collection of Redon's graphic works
... lithographs, 28 etchings and engravings, 9 drawings. These
include some of his most famous works. All the plates from Redon: ...
plus additional plates. New introduction and caption translations
by Alfred Werner. 209 illustrations. xxvii + 209pp. 9 1/8 x 12 1/4.
21966-8 Paperbound $4.50


CATALOGUE OF DOVER BOOKS 


Design by Accident; A Book of "Accidental Effects" for Artists and 
Designers, James F. O'Brien. Create your own unique, striking, imaginative effects 
by "controlled accident" interaction of materials: paints and lacquers, oil and water 
based paints, splatter, crackling materials, shatter, similar items. Everything you do 
will be different; first book on this limitless art, so useful to both fine artist and 
commercial artist. Full instructions. 192 plates showing "accidents," 8 in color, 
viii + 215pp. 8 3/8 x 11 1/4. 21942-9 Paperbound $3.75

The Book of Signs, Rudolf Koch. Famed German type designer draws 493 beau- 
tiful symbols: religious, mystical, alchemical, imperial, property marks, runes, etc. 
Remarkable fusion of traditional and modern. Good for suggestions of timelessness, 
smartness, modernity. Text, vi + 104pp. 6*/^ x 9%- 

20162-7 Paperbound $1.25 

History of Indian and Indonesian Art, Ananda K. Coomaraswamy. An un- 
abridged republication of one of the finest books by a great scholar in Eastern art. 
Rich in descriptive material, history, social backgrounds; Sunga reliefs, Rajput 
paintings, Gupta temples, Burmese frescoes, textiles, jewelry, sculpture, etc. 400 
photos. viii + 423pp. 6 3/8 x 9 3/4. 21436-2 Paperbound $5.00

Primitive Art, Franz Boas. America's foremost anthropologist surveys textiles, 
ceramics, woodcarving, basketry, metalwork, etc.; patterns, technology, creation of 
symbols, style origins. All areas of world, but very full on Northwest Coast Indians. 
More than 350 illustrations of baskets, boxes, totem poles, weapons, etc. 378 pp. 

20025-6 Paperbound $3.00

The Gentleman and Cabinet Maker's Director, Thomas Chippendale. Full 
reprint (third edition, 1762) of most influential furniture book of all time, by 
master cabinetmaker. 200 plates, illustrating chairs, sofas, mirrors, tables, cabinets, 
plus 24 photographs of surviving pieces. Biographical introduction by N. Bienenstock.
vi + 249pp. 9 7/8 x 12 3/4. 21601-2 Paperbound $4.00

American Antique Furniture, Edgar G. Miller, Jr. The basic coverage of all 
American furniture before 1840. Individual chapters cover type of furniture — 
clocks, tables, sideboards, etc. — chronologically, with inexhaustible wealth of data. 
More than 2100 photographs, all identified, commented on. Essential to all early
American collectors. Introduction by H. E. Keyes. vi + 1106pp. 7 7/8 x 10 3/4.
21599-7, 21600-4 Two volumes, Paperbound $11.00

Pennsylvania Dutch American Folk Art, Henry J. Kauffman. 279 photos,
28 drawings of tulipware, Fraktur script, painted tinware, toys, flowered furniture,
quilts, samplers, hex signs, house interiors, etc. Full descriptive text. Excellent for
tourist, rewarding for designer, collector. Map. 146pp. 7 7/8 x 10 3/4.
21205-X Paperbound $2.50


Early New England Gravestone Rubbings, Edmund V. Gillon, Jr. 43 photo- 
graphs, 226 carefully reproduced rubbings show heavily symbolic, sometimes 
macabre early gravestones, up to early 19th century. Remarkable early American 
primitive art, occasionally strikingly beautiful; always powerful. Text. xxvi +
207pp. 8 3/8 x 11 1/4. 21380-3 Paperbound $3.50





Alphabets and Ornaments, Ernst Lehner. Well-known pictorial source for 
decorative alphabets, script examples, cartouches, frames, decorative title pages, calligraphic
initials, borders, similar material. 14th to 19th century, mostly European.
Useful in almost any graphic arts designing, varied styles. 750 illustrations. 256pp.
7 x 10. 21905-4 Paperbound $4.00

Painting: A Creative Approach, Norman Colquhoun. For the beginner simple 
guide provides an instructive approach to painting: major stumbling blocks for 
beginner; overcoming them, technical points; paints and pigments; oil painting; 
watercolor and other media and color. New section on "plastic" paints. Glossary.
Formerly Paint Your Own Pictures. 221pp. 22000-1 Paperbound $1.75

The Enjoyment and Use of Color, Walter Sargent. Explanation of the relations
between colors themselves and between colors in nature and art, including
hundreds of little-known facts about color values, intensities, effects of high and 
low illumination, complementary colors. Many practical hints for painters, references 
to great masters. 7 color plates, 29 illustrations, x + 274pp. 

20944-X Paperbound $2.75 

The Notebooks of Leonardo Da Vinci, compiled and edited by Jean Paul 
Richter. 1566 extracts from original manuscripts reveal the full range of Leonardo’s 
versatile genius: all his writings on painting, sculpture, architecture, anatomy, 
astronomy, geography, topography, physiology, mining, music, etc., in both Italian 
and English, with 186 plates of manuscript pages and more than 500 additional 
drawings. Includes studies for the Last Supper, the lost Sforza monument, and 
other works. Total of xlvii + 866pp. 7 7/8 x 10 3/4.
22572-0, 22573-9 Two volumes, Paperbound $11.00


Montgomery Ward Catalogue of 1895. Tea gowns, yards of flannel and
pillow-case lace, stereoscopes, books of gospel hymns, the New Improved Singer
Sewing Machine, side saddles, milk skimmers, straight-edged razors, high-button
shoes, spittoons, and on and on . . . listing some 25,000 items, practically all illustrated.
Essential to the shoppers of the 1890's, it is our truest record of the spirit of
the period. Unaltered reprint of Issue No. 57, Spring and Summer 1895. Introduction
by Boris Emmet. Innumerable illustrations. xiii + 624pp. 8 1/2 x 11 5/8.
22377-9 Paperbound $6.95


The Crystal Palace Exhibition Illustrated Catalogue (London 1851).
One of the wonders of the modern world, the Crystal Palace Exhibition in which
all the nations of the civilized world exhibited their achievements in the arts and
sciences, presented in an equally important illustrated catalogue. More than 1700
items pictured with accompanying text (ceramics, textiles, cast-iron work, carpets,
pianos, sleds, razors, wall-papers, billiard tables, beehives, silverware and hundreds
of other artifacts) represent the focal point of Victorian culture in the Western
world. Probably the largest collection of Victorian decorative art ever assembled;
indispensable for antiquarians and designers. Unabridged republication of the
Art-Journal Catalogue of the Great Exhibition of 1851, with all terminal essays.
New introduction by John Gloag, F.S.A. xxxiv + 426pp. 9 x 12.
22503-8 Paperbound $5.00




A History of Costume, Carl Kohler. Definitive history, based on surviving pieces 
of clothing primarily, and paintings, statues, etc. secondarily. Highly readable text, 
supplemented by 594 illustrations of costumes of the ancient Mediterranean peoples, 
Greece and Rome, the Teutonic prehistoric period; costumes of the Middle Ages, 
Renaissance, Baroque, 18th and 19th centuries. Clear, measured patterns are pro- 
vided for many clothing articles. Approach is practical throughout. Enlarged by 
Emma von Sichart. 464pp. 21030-8 Paperbound $3.50

Oriental Rugs, Antique and Modern, Walter A. Hawley. A complete and 
authoritative treatise on the Oriental rug: where they are made, by whom and how,
designs and symbols, characteristics in detail of the six major groups, how to distinguish
them and how to buy them. Detailed technical data is provided on periods,
weaves, warps, wefts, textures, sides, ends and knots, although no technical background
is required for an understanding. 11 color plates, 80 halftones, 4 maps.
vi + 320pp. 6 1/8 x 9 1/8. 22366-3 Paperbound $3.00

Ten Books on Architecture, Vitruvius. By any standards the most important
book on architecture ever written. Early Roman discussion of aesthetics of building,
construction methods, orders, sites, and every other aspect of architecture has inspired,
instructed architecture for about 2,000 years. Stands behind Palladio,
Michelangelo, Bramante, Wren, countless others. Definitive Morris H. Morgan
translation. 68 illustrations. xii + 331pp. 20645-9 Paperbound $3.00

The Four Books of Architecture, Andrea Palladio. Translated into every
major Western European language in the two centuries following its publication in
1570, this has been one of the most influential books in the history of architecture.
Complete reprint of the 1738 Isaac Ware edition. New introduction by Adolf
Placzek, Columbia Univ. 216 plates. xxii + 110pp. of text. 9 1/2 x 12 3/4.
21308-0 Clothbound $12.50


Sticks and Stones: A Study of American Architecture and Civilization,
Lewis Mumford. One of the great classics of American cultural history. American
architecture from the medieval-inspired earliest forms to the early 20th century;
evolution of structure and style, and reciprocal influences on environment. 21 photographic
illustrations. 238pp. 20202-X Paperbound $2.00


The American Builder's Companion, Asher Benjamin. The most widely used
early 19th century architectural style and source book, for colonial up into Greek
Revival periods. Extensive development of geometry of carpentering, construction
of sashes, frames, doors, stairs; plans and elevations of domestic and other buildings.
Hundreds of thousands of houses were built according to this book, now invaluable
to historians, architects, restorers, etc. 1827 edition. 59 plates. $3.50


Dutch Houses in the Hudson Valley Before 1776, Helen Wilkinson Rey- 
nolds. The standard survey of the Dutch colonial house and outbuildings 
structional features, decoration, and local history associated with 

steads. Introduction by Franklin D. Roosevelt.





The Architecture of Country Houses, Andrew J. Downing. Together with 
Vaux's Villas and Cottages this is the basic book for Hudson River Gothic architecture
of the middle Victorian period. Full, sound discussions of general aspects of
housing, architecture, style, decoration, furnishing, together with scores of detailed 
house plans, illustrations of specific buildings, accompanied by full text. Perhaps 
the most influential single American architectural book. 1850 edition. Introduction 
by J. Stewart Johnson. 321 figures, 31 architectural designs, xvi + 560pp. 

22003-6 Paperbound $4.00

Lost Examples of Colonial Architecture, John Mead Howells. Full-page 
photographs of buildings that have disappeared or been so altered as to be denatured, 
including many designed by major early American architects. 245 plates. xvii +
248pp. 7 7/8 x 10 3/4. 21143-6 Paperbound $3.50

Domestic Architecture of the American Colonies and of the Early 
Republic, Fiske Kimball. Foremost architect and restorer of Williamsburg and
Monticello covers nearly 200 homes between 1620-1825. Architectural details, construction,
style features, special fixtures, floor plans, etc. Generally considered finest
work in its area. 219 illustrations of houses, doorways, windows, capital mantels.
xx + 314pp. 7 7/8 x 10 3/4. 21743-4 Paperbound $4.00


Early American Rooms: 1650-1858, edited by Russell Hawes Kettell. Tour of 12
rooms, each representative of a different era in American history and each furnished,
decorated, designed and occupied in the style of the era. 72 plans and elevations,
8-page color section, etc., show fabrics, wall papers, arrangements, etc. Full descriptive
text. xvii + 200pp. of text. 8 3/8 x 11 1/4.

21633-0 Paperbound $*>.00 


The Fitzwilliam Virginal Book, edited by J. Fuller Maitland and W. B. Squire.
Full modern printing of famous early 17th-century ms. volume of 300 works by
Morley, Byrd, Bull, Gibbons, etc. For piano or other modern keyboard instrument;
easy to read format. xxxvi + 938pp. 8 3/8 x 11.
21068-5, 21069-3 Two volumes, Paperbound $10.00


Keyboard Music, Johann Sebastian Bach. Bach Gesellschaft edition. A rich
selection of Bach's masterpieces for the harpsichord: the six English Suites, six
... (Clavierubung part I), the Goldberg Variations
(Clavierubung part IV), the fifteen Two-Part Inventions and the fifteen Three-Part
... eminently playable.
vi + 312pp. 8 1/8 x 11. 22360-4 Paperbound $5.00


The Music of Bach: An Introduction, Charles Sanford Terry. A fine non-technical
introduction to Bach's music, both instrumental and vocal. Covers organ
... Analyzes themes, developments,
innovations. x + 114pp. 21075-8 Paperbound $1.50

... Symphonies, Sir George Grove. Noted British musicologist
... history, analysis, commentary on symphonies. Very thorough
... advanced student and amateur music lover.
436 musical passages. vii + 407pp. 20334-1 Paperbound $> 75



Johann Sebastian Bach, Philipp Spitta. One of the great classics of musicology, 
this definitive analysis of Bach’s music (and life) has never been surpassed. Lucid, 
nontechnical analyses of hundreds of pieces (30 pages devoted to St. Matthew Pas- 
sion, 26 to B Minor Mass). Also includes major analysis of 18th-century music. 
450 musical examples. 40-page musical supplement. Total of xx + 1799pp.

(EUK) 22278-0, 22279-9 Two volumes, Clothbound $17.50 

Mozart and His Piano Concertos, Cuthbert Girdlestone. The only full-length 
study of an important area of Mozart’s creativity. Provides detailed analyses of all 
23 concertos, traces inspirational sources. 417 musical examples. Second edition. 
509pp. 21271-8 Paperbound $3.50 

The Perfect Wagnerite: A Commentary on the Niblung’s Ring, George 
Bernard Shaw. Brilliant and still relevant criticism in remarkable essays on 
Wagner's Ring cycle, Shaw’s ideas on political and social ideology behind the 
plots, role of Leitmotifs, vocal requisites, etc. Prefaces. xxi + 136pp.

(USO) 21707-8 Paperbound $1.75 


Don Giovanni, W. A. Mozart. Complete libretto, modern English translation;
biographies of composer and librettist; accounts of early performances and critical
reaction. Lavishly illustrated. All the material you need to understand and
appreciate this great work. Dover Opera Guide and Libretto Series; translated
and introduced by Ellen Bleiler. 92 illustrations. 209pp.
21134-7 Paperbound $2.00


Basic Electricity, U. S. Bureau of Naval Personnel. Originally a training course,
best non-technical coverage of basic theory of electricity and its applications. Fundamental
concepts, batteries, circuits, conductors and wiring techniques, AC and DC,
inductance and capacitance, generators, motors, transformers, magnetic amplifiers,
synchros, servomechanisms, etc. Also covers blue-prints, electrical diagrams, etc.
Many questions, with answers. 349 illustrations. x + 448pp. 6 1/2 x 9 1/4.
20973-3 Paperbound $3.50


Reproduction of Sound, Edgar Villchur. Thorough coverage for laymen of 
high fidelity systems, reproducing systems in general, needles, amplifiers, preamps, 
loudspeakers, feedback, explaining physical background. "A rare talent for making
technicalities vividly comprehensible," R. Darrell, High Fidelity. 69 figures.
21515-6 Paperbound $1.35


Hear Me Talkin' to Ya: The Story of Jazz as Told by the Men Who
Made It, Nat Shapiro and Nat Hentoff. Louis Armstrong, Fats Waller, Jo Jones,
Clarence Williams, Billie Holiday, Duke Ellington, Jelly Roll Morton and dozens
of other jazz greats tell how it was in Chicago's South Side, New Orleans, depression
Harlem and the modern West Coast as jazz was born and grew. xvi + 429pp.
21726-4 Paperbound $3.00


Fables of Aesop, translated by Sir Roger L'Estrange. A reproduction of
rare 1931 Paris edition; a selection of the most interesting fables, together with 50
imaginative drawings by Alexander Calder. v + 128pp. 6 1/2 x 9 1/4.
21780-9 Paperbound $1.50





Against the Grain (A Rebours), Joris K. Huysmans. Filled with weird images, 
evidences of a bizarre imagination, exotic experiments with hallucinatory drugs,
rich tastes and smells and the diversions of its sybarite hero Duc Jean des Esseintes,
this classic novel pushed 19th-century literary decadence to its limits. Full unabridged
edition. Do not confuse this with abridged editions generally sold. Introduction
by Havelock Ellis. xlix + 206pp. 22190-3 Paperbound $2.50

Variorum Shakespeare: Hamlet. Edited by Horace H. Furness; a landmark 
of American scholarship. Exhaustive footnotes and appendices treat all doubtful 
words and phrases, as well as suggested critical emendations throughout the play’s 
history. First volume contains editor’s own text, collated with all Quartos and 
Folios. Second volume contains full first Quarto, translations of Shakespeare's 
sources (Belleforest, and Saxo Grammaticus), Der Bestrafte Brudermord, and many 
essays on critical and historical points of interest by major authorities of past and 
present. Includes details of staging and costuming over the years. By far the 
best edition available for serious students of Shakespeare. Total of xx + 905pp.

21004-9, 21005-7, 2 volumes, Paperbound $7.00 

A Life of William Shakespeare, Sir Sidney Lee. This is the standard life of 
Shakespeare, summarizing everything known about Shakespeare and his plays. 
Incredibly rich in material, broad in coverage, clear and judicious, it has served 
thousands as the best introduction to Shakespeare. 1931 edition. 9 plates, 
xxix + 792pp. 21967-4 Paperbound $3.75

Masters of the Drama, John Gassner. Most comprehensive history of the drama
in print, covering every tradition from Greeks to modern Europe and America,
including India, Far East, etc. Covers more than 800 dramatists, 2000 plays, with
biographical material, plot summaries, theatre history, criticism, etc. "Best of its
kind in English," New Republic. 77 illustrations. xxii + 890pp.

20100-7 Clothbound $10.00 

The Evolution of the English Language, George McKnight. The growth 
of English, from the 14th century to the present. Unusual, non-technical account 
presents basic information in very interesting form: sound shifts, change in grammar 
and syntax, vocabulary growth, similar topics. Abundantly illustrated with quotations.
Formerly Modern English in the Making. xii + 590pp.

21932-1 Paperbound $3.50 

An Etymological Dictionary of Modern English, Ernest Weekley. Fullest,
richest work of its sort, by foremost British lexicographer. Detailed word histories,
including many colloquial and archaic words; extensive quotations. Do not confuse
this with the Concise Etymological Dictionary, which is much abridged. Total
of xxvii + 830pp. 6 1/2 x 9 1/4.

21873-2, 21874-0 Two volumes, Paperbound $7.90 

Flatland; a Romance of Many Dimensions, E. A. Abbott. Classic of 
science-fiction explores ramifications of life in a two-dimensional world, and what 
happens when a three-dimensional being intrudes. Amusing reading, but also use- 
ful as introduction to thought about hyperspace. Introduction by Banesh Hoffmann. 
16 illustrations, xx + 103pp. 20001-9 Paperbound $1.00 




Poems of Anne Bradstreet, edited with an introduction by Robert Hutchinson. 
A new selection of poems by America's first poet and perhaps the first significant 
woman poet in the English language. 48 poems display her development in works 
of considerable variety — love poems, domestic poems, religious meditations, formal 
elegies, "quaternions," etc. Notes, bibliography. viii + 222pp.
22160-1 Paperbound $2.50

Three Gothic Novels: The Castle of Otranto by Horace Walpole;
Vathek by William Beckford; The Vampyre by John Polidori, with Fragment
of a Novel by Lord Byron, edited by E. F. Bleiler. The first Gothic
novel, by Walpole; the finest Oriental tale in English, by Beckford; powerful
Romantic supernatural story in versions by Polidori and Byron. All extremely
important in history of literature; all still exciting, packed with supernatural
thrills, ghosts, haunted castles, magic, etc. xl + 291pp.

21232-7 Paperbound $2.50 

The Best Tales of Hoffmann, E. T. A. Hoffmann. 10 of Hoffmann’s most 
important stories, in modern re-editings of standard translations: Nutcracker and 
the King of Mice, Signor Formica, Automata, The Sandman, Rath Krespel, The 
Golden Flowerpot, Master Martin the Cooper, The Mines of Falun, The King's
Betrothed, A New Year's Eve Adventure. 7 illustrations by Hoffmann. Edited
by E. F. Bleiler. xxxix + 419pp. 21793-0 Paperbound $3.00

Ghost and Horror Stories of Ambrose Bierce, Ambrose Bierce. 23 strikingly 
modern stories of the horrors latent in the human mind: The Eyes of the Panther, 
The Damned Thing, An Occurrence at Owl Creek Bridge, An Inhabitant of Carcosa, 
etc., plus the dream-essay, Visions of the Night. Edited by E. F. Bleiler. xxii
+ 199pp. 20767-6 Paperbound $1.50 

Best Ghost Stories of J. S. LeFanu, J. Sheridan LeFanu. Finest stories by 
Victorian master often considered greatest supernatural writer of all. Carmilla, 
Green Tea, The Haunted Baronet, The Familiar, and 12 others. Most never before
available in the U. S. A. Edited by E. F. Bleiler. 8 illustrations from Victorian
publications. xvii + 467pp. 20415-4 Paperbound $3.00

Mathematical Foundations of Information Theory, A. I. Khinchin. Comprehensive
introduction to work of Shannon, McMillan, Feinstein and Khinchin,
placing these investigations on a rigorous mathematical basis. Covers entropy
concept in probability theory, uniqueness theorem, Shannon's inequality, ergodic
sources, the E property, martingale concept, noise, Feinstein's fundamental lemma,
Shannon's first and second theorems. Translated by R. A. Silverman and M. D.
Friedman. iii + 120pp. 60434-9 Paperbound $2.00


Seven Science Fiction Novels, H. G. Wells. The standard collection of the
great novels. Complete, unabridged. First Men in the Moon, Island of Dr. Moreau,
War of the Worlds, Food of the Gods, Invisible Man, Time Machine, In the Days
of the Comet. Not only science fiction fans, but every educated person owes it to
himself to read these novels. 1015pp. (USO) 20264-X Clothbound $6.00





Last and First Men and Star Maker, Two Science Fiction Novels, Olaf
Stapledon. Greatest future histories in science fiction. In the first, human intelligence
is the "hero," through strange paths of evolution, interplanetary invasions,
incredible technologies, near extinctions and reemergences. Star Maker describes the
quest of a band of star rovers for intelligence itself, through time and space: weird
inhuman civilizations, crustacean minds, symbiotic worlds, etc. Complete, unabridged.
v + 438pp. (USO) 21962-3 Paperbound $2.50

Three Prophetic Novels, H. G. Wells. Stages of a consistently planned future
for mankind. When the Sleeper Wakes, and A Story of the Days to Come, anticipate
Brave New World and 1984, in the 21st Century; The Time Machine, only complete
version in print, shows farther future and the end of mankind. All show
Wells's greatest gifts as storyteller and novelist. Edited by E. F. Bleiler. x
+ 335pp. (USO) 20605-X Paperbound $2.50


The Devil's Dictionary, Ambrose Bierce. America's own Oscar Wilde,
Ambrose Bierce, offers his barbed iconoclastic wisdom in over 1,000 definitions
hailed by H. L. Mencken as "some of the most gorgeous witticisms in the English
language." 145pp. 20487-1 Paperbound $1.25


Max and Moritz, Wilhelm Busch. Great children's classic, father of comic 
strip, of two bad boys, Max and Moritz. Also Ker and Plunk (Plisch und Plumm),
Cat and Mouse, Deceitful Henry, Ice-Peter, The Boy and the Pipe, and five other 
pieces. Original German, with English translation. Edited by H. Arthur Klein; 
translations by various hands and H. Arthur Klein. vi + 216pp.
20181-3 Paperbound $2.00

Pigs is Pigs and Other Favorites, Ellis Parker Butler. The title story is one 
of the best humor short stories, as Mike Flannery obfuscates biology and English. 
Also included, That Pup of Murchison's, The Great American Pie Company, and
Perkins of Portland. 14 illustrations, v + 109pp. 21532-6 Paperbound $1.25 


The Peterkin Papers, Lucretia P. Hale. It takes genius to be as stupidly mad as 
the Peterkins, as they decide to become wise, celebrate the "Fourth," keep a cow,
and otherwise strain the resources of the Lady from Philadelphia. Basic book of 
American humor. 153 illustrations. 219pp. 20794-3 Paperbound $2.00 


Perrault's Fairy Tales, translated by A. E. Johnson and S. R. Littlewood, with
34 full-page illustrations by Gustave Doré. All the original Perrault stories
(Cinderella, Sleeping Beauty, Bluebeard, Little Red Riding Hood, Puss in Boots, Tom
Thumb, etc.) with their witty verse morals and the magnificent illustrations of
Doré. One of the five or six great books of European fairy tales. viii + 117pp.
8 1/8 x 11. 22311-6 Paperbound $2.00


Old Hungarian Fairy Tales, Baroness Orczy. Favorites translated and adapted
by author of the Scarlet Pimpernel. Eight fairy tales include "The Suitors of Princess
Fire-Fly," "The Twin Hunchbacks," "Mr. Cuttlefish's Love Story," and "The
Enchanted Cat." This little volume of magic and adventure will captivate children
as it has for generations. 90 drawings by Montagu Barstow. 96pp.

(USO) 22293-4 Paperbound $1.95 




Book, Andrew Lang. Lang's color fairy books have long been 
children's favorites. This volume includes Rapunzel, Jack and the Bean-stalk and 
35 other stories, familiar and unfamiliar. 4 plates, 93 illustrations. x + 367pp.

21673-X Paperbound $2.50 

The Blue Fairy Book, Andrew Lang. Lang's tales come from all countries and all 
times. Here are 37 tales from Grimm, the Arabian Nights, Greek Mythology, and 
other fascinating sources. 8 plates, 130 illustrations, xi + 390pp. 

21437-0 Paperbound $2.50 

Household Stories by the Brothers Grimm. Classic English-language edition
of the well-known tales: Rumpelstiltskin, Snow White, Hansel and Gretel, The
Twelve Brothers, Faithful John, Rapunzel, Tom Thumb (52 stories in all). Translated
into simple, straightforward English by Lucy Crane. Ornamented with headpieces,
vignettes, elaborate decorative initials and a dozen full-page illustrations by
Walter Crane. x + 269pp. 21080-4 Paperbound $2.00

The Merry Adventures of Robin Hood, Howard Pyle. The finest modern versions
of the traditional ballads and tales about the great English outlaw. Howard
Pyle's complete prose version, with every word, every illustration of the first edition. 
Do not confuse this facsimile of the original (1883) with modern editions that 
change text or illustrations. 23 plates plus many page decorations. xxii + 296pp.

22043-5 Paperbound $2.50 

The Story of King Arthur and His Knights, Howard Pyle. The finest chil- 
dren's version of the life of King Arthur; brilliantly retold by Pyle, with 48 of his 
most imaginative illustrations. xviii + 313pp. 6 1/8 x 9 1/4.

21445-1 Paperbound $2.50 

The Wonderful Wizard of Oz, L. Frank Baum. America's finest children's 
book in facsimile of first edition with all Denslow illustrations in full color. The 
edition a child should have. Introduction by Martin Gardner. 23 color plates, 
scores of drawings. iv + 267pp. 20691-2 Paperbound $2.50

The Marvelous Land of Oz, L. Frank Baum. The second Oz book, every bit as 
imaginative as the Wizard. The hero is a boy named Tip, but the Scarecrow and the 
Tin Woodman are back, as is the Oz magic. 16 color plates, 120 drawings by John 
R. Neill. 287pp. 20692-0 Paperbound $2.50 

The Magical Monarch of Mo, L. Frank Baum. Remarkable adventures in a land 
even stranger than Oz. The best of Baum's books not in the Oz series. 15 color 
plates and dozens of drawings by Frank Verbeck. xviii + 237pp.

21892-9 Paperbound $2.25 

The Bad Child's Book of Beasts, More Beasts for Worse Children, A
Moral Alphabet, Hilaire Belloc. Three complete humor classics in one volume.
Be kind to the frog, and do not call him names . . . and 28 other whimsical animals.
Familiar favorites and some not so well known. Illustrated by Basil Blackwell.
156pp. (USO) 20749-8 Paperbound $1.50





East O’ the Sun and West O’ the Moon, George W. Dasent. Considered the 
best of all translations of these Norwegian folk tales, this collection has been enjoyed 
by generations of children (and folklorists too). Includes True and Untrue, Why the
Sea is Salt, East O’ the Sun and West O’ the Moon, Why the Bear is Stumpy-Tailed, 
Boots and the Troll, The Cock and the Hen, Rich Peter the Pedlar, and 52 more. 
The only edition with all 59 tales. 77 illustrations by Erik Werenskiold and Theodor 
Kittelsen. xv + 418pp. 22521-6 Paperbound $3.50

Goops and How to be Them, Gelett Burgess. Classic of tongue-in-cheek humor, 
masquerading as etiquette book. 87 verses, twice as many cartoons, show mis- 
chievous Goops as they demonstrate to children virtues of table manners, neatness, 
courtesy, etc. Favorite for generations. viii + 88pp. 6 1/2 x 9 1/4.

22233-0 Paperbound $1.25 

Alice’s Adventures Under Ground, Lewis Carroll. The first version, quite 
different from the final Alice in Wonderland, printed out by Carroll himself with
his own illustrations. Complete facsimile of the "million dollar" manuscript Carroll
gave to Alice Liddell in 1864. Introduction by Martin Gardner. viii + 96pp. Title
and dedication pages in color. 21482-6 Paperbound $1.25 

The Brownies, Their Book, Palmer Cox. Small as mice, cunning as foxes, exu- 
berant and full of mischief, the Brownies go to the zoo, toy shop, seashore, circus, 
etc., in 24 verse adventures and 266 illustrations. Long a favorite, since their first 
appearance in St. Nicholas Magazine. xi + 144pp. 6 5/8 x 9 1/4.

21265-3 Paperbound $1.75 

Songs of Childhood, Walter De La Mare. Published (under the pseudonym 
Walter Ramal) when De La Mare was only 29, this charming collection has long 
been a favorite children’s book. A facsimile of the first edition in paper, the 47 poems 
capture the simplicity of the nursery rhyme and the ballad, including such lyrics as 
I Met Eve, Tartary, The Silver Penny. vii + 106pp. (USO) 21972-0 Paperbound $1.25

The Complete Nonsense of Edward Lear, Edward Lear. The finest 19th-century
humorist-cartoonist in full: all nonsense limericks, zany alphabets, Owl and Pussycat,
songs, nonsense botany, and more than 500 illustrations by Lear himself. Edited
by Holbrook Jackson. xxix + 287pp. (USO) 20167-8 Paperbound $2.00

Billy Whiskers: The Autobiography of a Goat, Frances Trego Montgomery.
A favorite of children since the early 20th century, here are the escapades of that
rambunctious, irresistible and mischievous goat, Billy Whiskers. Much in the
spirit of Peck's Bad Boy, this is a book that children never tire of reading or hearing.
All the original familiar illustrations by W. H. Fry are included: 6 color plates,
18 black and white drawings. 159pp. 22345-0 Paperbound $2.00

Mother Goose Melodies. Faithful republication of the fabulously rare Munroe
and Francis "copyright 1833" Boston edition, the most important Mother Goose
collection, usually referred to as the "original." Familiar rhymes plus many rare
ones, with wonderful old woodcut illustrations. Edited by E. F. Bleiler. 128pp.
4 1/2 x 6 3/8. 22577-1 Paperbound $1.00




Two Little Savages; Being the Adventures of Two Boys Who Lived as 
Indians and What They Learned, Ernest Thompson Seton. Great classic of 
nature and boyhood provides a vast range of woodlore in most palatable form, a 
genuinely entertaining story. Two farm boys build a teepee in woods and live in it 
for a month, working out Indian solutions to living problems, star lore, birds and 
animals, plants, etc. 293 illustrations. vii + 286pp. 

20985-7 Paperbound $2.50 


Peter Piper’s Practical Principles of Plain & Perfect Pronunciation. 
Alliterative jingles and tongue-twisters of surprising charm, that made their first 
appearance in America about 1830. Republished in full with the spirited woodcut 
illustrations from this earliest American edition. 32pp. 4 1/2 x 6^. 

22560-7 Paperbound $1.00 


Science Experiments and Amusements for Children, Charles Vivian. 73 easy 
experiments, requiring only materials found at home or easily available, such as 
candles, coins, steel wool, etc.; illustrate basic phenomena like vacuum, simple 
chemical reaction, etc. All safe. Modern, well-planned. Formerly Science Games 
for Children. 102 photos, numerous drawings. 96pp. 6}/^ x 9V^. 

21856-2 Paperbound $1.25 

An Introduction to Chess Moves and Tactics Simply Explained, Leonard 
Barden. Informal intermediate introduction, quite strong in explaining reasons for 
moves. Covers basic material, tactics, important openings, traps, positional play in 
middle game, end game. Attempts to isolate patterns and recurrent configurations. 
Formerly Chess. 58 figures. 102pp. (USO) 21210-6 Paperbound $1.25 


Lasker's Manual of Chess, Dr. Emanuel Lasker. Lasker was not only one of the 
five great World Champions, he was also one of the ablest expositors, theorists, and 
analysts. In many ways, his Manual, permeated with his philosophy of battle, filled 
with keen insights, is one of the greatest works ever written on chess. Filled with 
analyzed games by the great players. A single-volume library that will profit almost 
any chess player, beginner or master. 308 diagrams. xli + 349pp. 

20640-8 Paperbound $2.75 

The Master Book of Mathematical Recreations, Fred Schuh. In opinion of 
many the finest work ever prepared on mathematical puzzles, stunts, recreations; 
exhaustively thorough explanations of mathematics involved, analysis of effects, 
citation of puzzles and games. Mathematics involved is elementary. Translated by 
F. Göbel. 194 figures. xxiv + 430pp. 22134-2 Paperbound $3.50 


Mathematics, Magic and Mystery, Martin Gardner. Puzzle editor for Scientific 
American explains mathematics behind various mystifying tricks: card tricks, stage 
“mind reading,” coin and match tricks, counting out games, geometric dissections, 
etc. Probability sets, theory of numbers clearly explained. Also provides more than 
400 tricks, guaranteed. 135 illustrations. xii + 176pp. 

20335-2 Paperbound $l.lf' 






Mathematical Puzzles for Beginners and Enthusiasts, Geoffrey Mott-Smith. 
189 puzzles from easy to difficult, involving arithmetic, logic, algebra, properties 
of digits, probability, etc., for enjoyment and mental stimulus. Explanation of 
mathematical principles behind the puzzles. 135 illustrations. viii + 248pp. 

20198-8 Paperbound $1.75 

Paper Folding for Beginners, William D. Murray and Francis J. Rigney. Easiest 
book on the market, clearest instructions on making interesting, beautiful origami. 
Sail boats, cups, roosters, frogs that move legs, bonbon boxes, standing birds, etc. 
40 projects; more than 275 diagrams and photographs. 94pp. 

20713-7 Paperbound $1.00 

Tricks and Games on the Pool Table, Fred Herrmann. 79 tricks and games — 
some solitaires, some for two or more players, some competitive games — to entertain 
you between formal games. Mystifying shots and throws, unusual caroms, tricks 
involving such props as cork, coins, a hat, etc. Formerly Fun on the Pool Table. 
77 figures. 95pp. 21814-7 Paperbound $1.25 

Hand Shadows to be Thrown Upon the Wall: A Series of Novel and 
Amusing Figures Formed by the Hand, Henry Bursill. Delightful picturebook 
from great-grandfather’s day shows how to make 18 different hand shadows: a bird 
that flies, duck that quacks, dog that wags his tail, camel, goose, deer, boy, turtle, 
etc. Only book of its sort. vi + 33pp. 6 1/2 x 9V^. 21779-5 Paperbound $1.00 

Whittling and Woodcarving, E. J. Tangerman. 18th printing of best book on 
market. “If you can cut a potato you can carve” toys and puzzles, chains, chessmen, 
caricatures, masks, frames, woodcut blocks, surface patterns, much more. Information 
on tools, woods, techniques. Also goes into serious wood sculpture from Middle 
Ages to present, East and West. 464 photos, figures. x + 293pp. 

20965-2 Paperbound $2.00 

History of Philosophy, Julian Marias. Possibly the clearest, most easily followed, 
best planned, most useful one-volume history of philosophy on the market; neither 
skimpy nor overfull. Full details on system of every major philosopher and dozens 
of less important thinkers from pre-Socratics up to Existentialism and later. Strong 
on many European figures usually omitted. Has gone through dozens of editions in 
Europe. 1966 edition, translated by Stanley Appelbaum and Clarence Strowbridge. 
xviii + 505pp. 21739-6 Paperbound $3.50 


Yoga: A Scientific Evaluation, Kovoor T. Behanan. Scientific but non-technical 
study of physiological results of yoga exercises; done under auspices of Yale U. 
Relations to Indian thought, to psychoanalysis, etc. 16 photos. xxiii + 270pp. 

20505-3 Paperbound $2.50 


Prices subject to change without notice. 

Available at your book dealer or write for free catalogue to Dept. GI, Dover 
Publications, Inc., 180 Varick St., N. Y., N. Y. 10014. Dover publishes more than 
150 books each year on science, elementary and advanced mathematics, biology, 
music, art, literary history, social sciences and other areas. 

